Appendices


Introduction 


Breast  cancer  is  one  of  the  most  common  neoplasms  in  women  and  is  a  leading  cause  of  cancer 
related  deaths  worldwide.  Improved  diagnostic  tools  have  made  it  possible  to  detect  breast 
cancers  at  early  stages  leading  to  a  significant  decrease  in  breast  cancer  mortality  rates  over  the 
past  decades.  However,  mortality  rates  of  advanced  stage  cancer  have  not  decreased  significantly 
due  to  lack  of  effective  therapies  and  approximately  25  %  of  breast  cancer  patients  will  die  of 
their  disease.  Therefore,  the  development  and  application  of  new  molecularly  based  diagnostic 
and  prognostic  tools  and  therapies  is  of  utmost  importance.  The  key  to  the  development  of  such 
rational  preventative  and  therapeutic  approaches  lies  in  the  identification  of  genes  and 
biochemical  pathways  involved  in  breast  tumorigenesis.  One  approach  to  the  discovery  of  novel 
diagnostic  and  prognostic  markers,  and  therapeutic  targets  is  to  compare  the  gene  expression 
profiles  of  normal  and  cancer  cells  and  identify  genes  or  subsets  of  genes  with  expression  levels 
that  correlate  with  tumor  stage  or  clinical  outcome.  Several  comprehensive  gene  expression 
profiling  studies  have  been  performed  in  breast  cancer  and  several  novel  putative  molecular 
markers  have  been  identified.  Most  of  these  studies  utilized  array  based  platforms,  and  therefore, 
were  inherently  limited  to  the  analysis  of  known  genes  and  ESTs.  SAGE  is  an  alternative 
comprehensive  gene  expression  profiling  technique  that  does  not  require  the  a  priori  knowledge 
of  the  transcripts  present  in  the  cells.  Thus,  it  allows  for  the  identification  of  novel  transcripts 
making  it  particularly  suitable  for  the  discovery  of  new  molecular  targets. 

In  this  study  we  utilized  the  SAGE  technology  to  determine  the  comprehensive  gene  expression 
profiles  of  normal  breast  tissue  and  breast  carcinomas  of  all  clinical  stages  with  the  aim  of 
identifying  genes  involved  in  the  initiation  and  progression  of  breast  tumorigenesis.  Based  on 
this analysis we identified a novel gene, IBC-1 (Invasive Breast Carcinoma-1), encoding a small
secreted  protein  that  is  only  expressed  in  a  subset  of  invasive  breast  carcinomas,  in  the  pons  of 
the  brain,  and  in  hippocampal  neurons  following  oxidative  insult,  but  not  in  any  other  normal 
human  tissue.  IBC-1  encodes  a  novel  growth  and  survival  factor  that  is  overexpressed  in  a 
significant  fraction  of  invasive  breast  carcinomas  with  poor  prognostic  features.  IBC-1  has  no 
significant homology to known proteins; therefore, it may be involved in a novel signaling
pathway. Based on amino acid sequence, IBC-1 corresponds to the cancer-associated
cachexia-inducing factor previously identified by others.
• We developed highly sensitive ELISA assays using the different IBC-1 antibodies and
determined that IBC-1 can be detected in the sera of breast cancer patients. We are
currently collecting a larger set of samples to be tested in collaboration with Drs.
Lyndsay Harris (DFCI) and (BU).

•  We  have  generated  a  series  of  constitutive  and  inducible  mammalian  expression 
constructs,  adenoviruses,  and  retroviruses  expressing  the  human  IBC-1  protein. 

•  We  have  determined  that  high  and  low  affinity  IBC-1  receptors  are  present  on  the  cell 
surface  of  breast  carcinomas  and  neurons  of  the  brain.  The  presence  of  the  high  affinity 
receptor  appears  to  correlate  with  cellular  response  to  IBC-1. 

• The expression or exogenous addition of IBC-1 to 21NT and SUM52 breast cancer cell
lines  enhances  cell  proliferation  and  survival.  We  have  also  tested  5  additional  breast 
cancer  and  immortalized  cell  lines,  but  IBC-1  had  no  effect  on  the  growth  of  these  other 
cells.  Using  ligand  binding  assays  we  determined  that  this  could  be  due  to  the  lack  of  a 
high-affinity  IBC-1  receptor  on  these  cells. 

•  We  determined  that  IBC-1  has  no  effect  on  the  3D  growth  and  invasion  of  21NT  and 
SUM52  cells  (the  cell  lines  that  respond  to  IBC-1). 

•  We  developed  a  TET-OFF  inducible  IBC-1  expression  system  in  the  MCFDCIS.com  cell 
line  and  performed  xenograft  assays  in  nude  mice.  Based  on  one  experiment  we  did  not 
see an effect on tumor growth, but it appears that this cell line does not respond to IBC-1
in vitro either. Unfortunately, the 21NT and SUM52 cell lines are poorly tumorigenic;
thus, we have not been able to use them for xenograft experiments. We also tested whether
co-injection of human breast cancer stroma would enhance the tumorigenic potential of
21NT  cells,  but  even  using  this  approach  we  have  not  been  able  to  generate  xenografts. 

•  In  collaboration  with  Dr.  Giulio  Passinetti  (Mount  Sinai,  NY)  we  further  characterized 
the  possible  involvement  of  IBC-1  in  neurodegenerative  disease.  The  results  of  these 
studies  will  be  submitted  for  publication  in  the  near  future. 


Reportable outcomes:

Please see the attached manuscript published in Proc. Natl. Acad. Sci. USA reporting the results of
the IBC-1 study and an additional manuscript resulting from the work of Dr. Allinen, currently in
press in Cancer Cell.

Antibodies  generated: 

We have generated several different (2 N-terminal peptide and 1 C-terminal peptide) rabbit
polyclonal  antibodies  against  the  human  IBC-1  protein  and  a  variant  of  IBC-1  that  is  generated 
by  alternative  splicing  and  has  a  different  C-terminus. 

Cell lines generated:

We have generated several human cell lines that constitutively or inducibly overexpress the
human IBC-1 protein or a variant of IBC-1.


Conclusions: 

In summary, we determined that IBC-1/DCD is a novel growth and survival factor that is
overexpressed in approximately 10% of primary invasive breast carcinomas, and its
overexpression is, at least in some cases, associated with a gain of its locus at 12q13.1. Based on
its function and restricted expression pattern in normal adult tissues, IBC-1/DCD is a candidate
cancer therapeutic target. The secreted nature and extracellular mechanism of DCD action make
it even more attractive for such a purpose.

Neurons are particularly sensitive to reactive oxygen species (ROS), whereas tumor cells
themselves produce large amounts of ROS. Therefore, the high expression of IBC-1/DCD in
these cell types may be essential for their survival. Thus, therapeutic activation of the
IBC-1/DCD signaling pathway may be beneficial in certain neurodegenerative diseases involving
catecholaminergic neurons, such as Parkinson's disease, while its therapeutic inhibition may be an
effective treatment of tumors with DCD expression.




References: 


1. Greenlee, R. T., Murray, T., Bolden, S. & Wingo, P. A. (2000) CA Cancer J Clin 50, 7-33.

2.  Perou,  C.  M.,  Sorlie,  T.,  Eisen,  M.  B.,  van  de  Rijn,  M.,  Jeffrey,  S.  S.,  Rees,  C.  A.,  Pollack, 
J.  R.,  Ross,  D.  T.,  Johnsen,  H.,  Akslen,  L.  A.,  Fluge,  O.,  Pergamenschikov,  A.,  Williams, 
C.,  Zhu,  S.  X.,  Lonning,  P.  E.,  Borresen-Dale,  A.  L.,  Brown,  P.  O.  &  Botstein,  D.  (2000) 
Nature  406,  747-52. 

3.  Sorlie,  T.,  Perou,  C.  M.,  Tibshirani,  R.,  Aas,  T.,  Geisler,  S.,  Johnsen,  H.,  Hastie,  T.,  Eisen, 
M.  B.,  van  de  Rijn,  M.,  Jeffrey,  S.  S.,  Thorsen,  T.,  Quist,  H.,  Matese,  J.  C.,  Brown,  P.  O., 
Botstein, D., Eystein Lonning, P. & Borresen-Dale, A. L. (2001) Proc Natl Acad Sci USA
98, 10869-74.

4.  van  de  Vijver,  M.  J.,  He,  Y.  D.,  van't  Veer,  L.  J.,  Dai,  H.,  Hart,  A.  A.,  Voskuil,  D.  W., 
Schreiber,  G.  J.,  Peterse,  J.  L.,  Roberts,  C.,  Marton,  M.  J.,  Parrish,  M.,  Atsma,  D., 
Witteveen,  A.,  Glas,  A.,  Delahaye,  L.,  van  der  Velde,  T.,  Bartelink,  H.,  Rodenhuis,  S., 
Rutgers,  E.  T.,  Friend,  S.  H.  &  Bernards,  R.  (2002)  N  Engl  J  Med  347,  1999-2009. 

5.  van 't  Veer,  L.  J.,  Dai,  H.,  van  de  Vijver,  M.  J.,  He,  Y.  D.,  Hart,  A.  A.,  Mao,  M.,  Peterse, 
H.  L.,  van  der  Kooy,  K.,  Marton,  M.  J.,  Witteveen,  A.  T.,  Schreiber,  G.  J.,  Kerkhoven,  R. 
M.,  Roberts,  C.,  Linsley,  P.  S.,  Bernards,  R.  &  Friend,  S.  H.  (2002)  Nature  415,  530-6. 

6.  Porter,  D.,  Lahti-Domenici,  J.,  Keshaviah,  A.,  Bae,  Y.  K.,  Argani,  P.,  Marks,  J., 
Richardson,  A.,  Cooper,  A.,  Strausberg,  R.,  Riggins,  G.  J.,  Schnitt,  S.,  Gabrielson,  E., 
Gelman,  R.  &  Polyak,  K.  (2003)  Mol  Cancer  Res  1,  362-75. 

7.  Velculescu,  V.  E.,  Zhang,  L.,  Vogelstein,  B.  &  Kinzler,  K.  W.  (1995)  Science  270,  484-7. 

8.  Polyak,  K.  &  Riggins,  G.  J.  (2001)  J  Clin  Oncol  19,  2948-58. 

9.  Kononen,  J.,  Bubendorf,  L.,  Kallioniemi,  A.,  Barlund,  M.,  Schraml,  P.,  Leighton,  S., 
Torhorst,  J.,  Mihatsch,  M.  J.,  Sauter,  G.  &  Kallioniemi,  O.  P.  (1998)  Nat  Med  4,  844-7. 

10.  Hulette,  C.  M.,  Welsh-Bohmer,  K.  A.,  Crain,  B.,  Szymanski,  M.  H.,  Sinclaire,  N.  O.  & 
Roses,  A.  D.  (1997)  Arch  Pathol  Lab  Med  121,  615-8. 

11.  Krop,  I.  E.,  Sgroi,  D.,  Porter,  D.  A.,  Lunetta,  K.  L.,  LeVangie,  R.,  Seth,  P.,  Kaelin,  C.  M., 
Rhei,  E.,  Bosenberg,  M.,  Schnitt,  S.,  Marks,  J.  R.,  Pagon,  Z.,  Belina,  D.,  Razumovic,  J.  & 
Polyak,  K.  (2001)  Proc  Natl  Acad  Sci  USA  98,  9796-9801. 

12.  St  Croix,  B.,  Rago,  C.,  Velculescu,  V.,  Traverso,  G.,  Romans,  K.  E.,  Montgomery,  E., 
Lal, A., Riggins, G. J., Lengauer, C., Vogelstein, B. & Kinzler, K. W. (2000) Science 289,
1197-202.

13.  Qian,  Y.,  Fritzsch,  B.,  Shirasawa,  S.,  Chen,  C.  L.,  Choi,  Y.  &  Ma,  Q.  (2001)  Genes  Dev 
15,  2533-45. 

14.  Kuchinka,  B.  D.,  Kalousek,  D.  K.,  Lomax,  B.  L.,  Harrison,  K.  J.  &  Barrett,  I.  J.  (1995) 
Mod  Pathol  8, 183-6. 

15. Pinkel, D., Segraves, R., Sudar, D., Clark, S., Poole, I., Kowbel, D., Collins, C., Kuo, W.
L., Chen, C., Zhai, Y., Dairkee, S. H., Ljung, B. M., Gray, J. W. & Albertson, D. G.
(1998) Nat Genet 20, 207-11.

16.  Flanagan,  J.  G.  &  Leder,  P.  (1990)  Cell  63, 185-194. 

17. Lal, A., Lash, A. E., Altschul, S. F., Velculescu, V., Zhang, L., McLendon, R. E., Marra,
M. A., Prange, C., Morin, P. J., Polyak, K., Papadopoulos, N., Vogelstein, B., Kinzler, K.
W., Strausberg, R. L. & Riggins, G. J. (1999) Cancer Res 59, 5403-7.




18.  Porter,  D.  A.,  Krop,  I.  E.,  Nasser,  S.,  Sgroi,  D.,  Kaelin,  C.  M.,  Marks,  J.  R.,  Riggins,  G.  & 
Polyak,  K.  (2001)  Cancer  Res  61,  5697-702. 

19. Burge, C. & Karlin, S. (1997) J Mol Biol 268, 78-94.

20. Cariuk, P., Lorite, M. J., Todorov, P. T., Field, W. N., Wigmore, S. J. & Tisdale, M. J.
(1997) Br J Cancer 76, 606-13.

21.  Cunningham,  T.  J.,  Hodge,  L.,  Speicher,  D.,  Reim,  D.,  Tyler-Polsz,  C.,  Levitt,  P., 
Eagleson, K., Kennedy, S. & Wang, Y. (1998) J Neurosci 18, 7047-60.

22.  Schittek,  B.,  Hipfel,  R.,  Sauer,  B.,  Bauer,  J.,  Kalbacher,  H.,  Stevanovic,  S.,  Schirle,  M., 
Schroeder,  K.,  Blin,  N.,  Meier,  F.,  Rassner,  G.  &  Garbe,  C.  (2001)  Nat  Immunol  2,  1133- 
7. 

23. Spitz, D. R., Sim, J. E., Ridnour, L. A., Galoforo, S. S. & Lee, Y. J. (2000) Ann N Y Acad
Sci 899, 349-62.

24.  Sanghi,  S.,  Kumar,  R.,  Lumsden,  A.,  Dickinson,  D.,  Klepeis,  V.,  Trinkaus-Randall,  V., 
Frierson,  H.  F.,  Jr.  &  Laurie,  G.  W.  (2001)  J  Mol  Biol  310,  127-39. 

25. McDevitt, T. M. & Tisdale, M. J. (1992) Br J Cancer 66, 815-20.

26.  Todorov,  P.  T.,  McDevitt,  T.  M.,  Cariuk,  P.,  Coles,  B.,  Deacon,  M.  &  Tisdale,  M.  J. 
(1996)  Cancer  Res  56,  1256-61. 

27.  Todorov,  P.,  Cariuk,  P.,  McDevitt,  T.,  Coles,  B.,  Fearon,  K.  &  Tisdale,  M.  (1996)  Nature 
379,  739-42. 

28.  Cunningham,  T.  J.,  Jing,  H.,  Akerblom,  I.,  Morgan,  R.,  Fisher,  T.  S.  &  Neveu,  M.  (2002) 
Exp  Neurol  177,  32-9. 

29.  Szatrowski,  T.  P.  &  Nathan,  C.  F.  (1991)  Cancer  Res  51,  794-8. 


Appendices:

1. Porter D, Weremowicz S, Chin K, Seth P, Keshaviah A, Lahti-Domenici J, Bae YK,
Monitto CL, Merlos-Suarez A, Chan J, Hulette CM, Richardson A, Morton CC, Marks J,
Duyao M, Hruban R, Gabrielson E, Gelman R, and Polyak K. A neural survival factor is a
candidate oncogene in breast cancer. Proc Natl Acad Sci USA 2003; 100:10931-10936.

2. Allinen M, Beroukhim R, Cai L, Brennan C, Lahti-Domenici J, Huang H, Porter D, Hu
M, Chin L, Richardson A, Schnitt S, Sellers W, Polyak K. Molecular characterization of
the tumor microenvironment in breast cancer. Cancer Cell 2004; In press.

3.  Seth  P,  Krop  IE,  Porter  DA  and  Polyak  K.  Novel  estrogen  and  tamoxifen  induced  genes  identified 
by  SAGE  (Serial  Analysis  of  Gene  Expression).  Oncogene  2002;  21:836-843. 




APPENDIX  1 




A  neural  survival  factor  is  a  candidate  oncogene  in 
breast  cancer 

Dale Porter(a,b), Stanislawa Weremowicz(c,d), Koei Chin(e), Pankaj Seth(a,b), Aparna Keshaviah(f), Jaana Lahti-Domenici(a),
Young Kyung Bae(g), Constance L. Monitto(h), Ana Merlos-Suarez(a,b), Jennifer Chan(c,d), Christine M. Hulette(i),
Andrea Richardson(c,d), Cynthia C. Morton(d,j), Jeffrey Marks(i), Mabel Duyao(k), Ralph Hruban(g), Edward Gabrielson(g),
Rebecca Gelman(f,l), and Kornelia Polyak(a,b,m)

Departments of (a)Medical Oncology and (f)Biostatistics, Dana-Farber Cancer Institute, Departments of (c)Pathology and (j)Obstetrics, Gynecology, and
Reproductive Biology, Brigham and Women's Hospital, and Departments of (b)Medicine and (d)Pathology, Harvard Medical School and (l)Harvard
School of Public Health, Boston, MA 02115; (e)Comprehensive Cancer Center, University of California, San Francisco, CA 94143; Departments
of (g)Pathology and (h)Anesthesiology and Critical Care Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21231;
(i)Departments of Medicine and Pathology, Duke University Medical Center, Durham, NC 27710; and (k)Ardais Corporation,
Lexington, MA 02421

Edited  by  Bert  Vogelstein,  The  Sidney  Kimmel  Comprehensive  Cancer  Center  at  Johns  Hopkins,  Baltimore,  MD,  and  approved  July  23,  2003 
(received  for  review  May  17,  2003) 


Using  serial  analysis  of  gene  expression  (SAGE),  we  identified  a 
SAGE  tag  that  was  present  only  in  invasive  breast  carcinomas  and 
their  lymph  node  metastases.  The  transcript  corresponding  to  this 
SAGE  tag,  dermcidin  (DCD),  encodes  a  secreted  protein  normally 
expressed  only  in  the  pons  of  the  brain  and  sweat  glands.  Array 
comparative genomic hybridization, fluorescence in situ hybridization,
and immunohistochemical analyses determined that DCD is
overexpressed in ≈10% of invasive breast carcinomas; in some
cases  its  overexpression  is  coupled  with  a  focal  copy  number  gain 
of  its  locus  at  12q13.1,  and  its  expression  is  associated  with 
advanced  clinical  stage  and  poor  prognosis.  Expression  of  DCD  in 
breast  cancer  cells  promotes  cell  growth  and  survival  and  reduces 
serum  dependency.  Putative  high-  and  low-affinity  receptors  for 
DCD  are  present  on  the  cell  surface  of  breast  carcinomas  and 
neurons  of  the  brain.  Based  on  these  data  we  hypothesize  that  DCD 
may  play  a  role  in  tumorigenesis  by  means  of  enhancing  cell 
growth  and  survival  in  a  subset  of  breast  carcinomas. 

Breast  cancer  is  one  of  the  most  common  neoplasms  in  women 
and  is  a  leading  cause  of  cancer-related  deaths  worldwide. 
Improved  diagnostic  tools  have  made  it  possible  to  detect  breast 
cancers  at  early  stages,  leading  to  a  significant  decrease  in  breast 
cancer  mortality  rates  over  the  past  decades  (1).  However, 
mortality  rates  of  advanced-stage  cancer  have  not  decreased 
significantly because of a lack of effective therapies, and ≈25%
of breast cancer patients will die of the disease (1). Therefore,
the development and application of new molecularly based
diagnostic and prognostic tools and therapies are of utmost
importance. The key to the development of such rational
preventive and therapeutic approaches lies in the identification of
genes and biochemical pathways involved in breast tumorigenesis.
One approach to the discovery of novel diagnostic and
prognostic markers and therapeutic targets is to compare the
gene expression profiles of normal and cancer cells and identify
genes or subsets of genes with expression levels that correlate
with tumor stage or clinical outcome. Several comprehensive
gene expression profiling studies have been performed in breast
cancer, and several novel putative molecular markers have been
identified (2-6). Most of these studies used array-based
platforms and, therefore, were inherently limited to the analysis of
known genes and ESTs. Serial analysis of gene expression
(SAGE) is an alternative comprehensive gene expression
profiling technique that does not require a priori knowledge of
the  transcripts  present  in  the  cells.  Thus,  it  allows  for  the 
identification  of  novel  transcripts,  making  it  particularly  suitable 
for  the  discovery  of  new  molecular  targets  (7,  8). 

In  this  study  we  used  the  SAGE  technology  to  determine  the 
comprehensive  gene  expression  profiles  of  normal  breast  tissue 
and breast carcinomas of all clinical stages with the aim of
identifying genes involved in the initiation and progression of
breast tumorigenesis. This approach led to the identification of
a previously uncharacterized growth and survival factor that is
overexpressed in a significant fraction of invasive breast
carcinomas with poor prognostic features.

Methods 

Cell  Lines  and  Tissue  Specimens.  Breast  cancer  cell  lines  were 
obtained  from  American  Type  Culture  Collection  or  were 
generously  provided  by  Steve  Ethier  (University  of  Michigan, 
Ann  Arbor),  Gail  Tomlinson  (University  of  Texas,  Austin),  and 
Arthur  Pardee  (Dana-Farber  Cancer  Institute).  Cells  were 
grown in media recommended by the provider. Tumor specimens
were obtained from Brigham and Women's and Massachusetts
General Hospitals (Boston), Duke University, University
Hospital Zagreb (Zagreb, Croatia), and the National
Disease Research Interchange, snap frozen on dry ice, and
stored at -80°C until use. All human tissue was collected using
protocols approved by the institutional review boards. Tissue
microarrays were (i) obtained from Imgenex (San Diego),
Ambion (Austin, TX), Ardais Corporation, and Gentaur (Brussels);
(ii) provided by the Cooperative Breast Cancer Tissue
Resource; and (iii) generated at Johns Hopkins University and
at  Beth  Israel  Deaconess  Medical  Center  following  published 
protocols  (9).  Brain  samples  were  collected  from  autopsies 
performed  at  Brigham  and  Women’s  Hospital  and  from  subjects 
prospectively  enrolled  in  the  Rapid  Autopsy  Program  of  the 
Joseph  and  Kathleen  Price  Bryan  Alzheimer  Disease  Research 
Center  at  Duke  University  Medical  Center  (10). 

RNA  Preparation,  mRNA  in  Situ  Hybridization,  and  Northern  Blot 
Analysis.  We  performed  RNA  isolation,  RT-PCR,  and  Northern 
blot analyses as described (11). We performed mRNA in situ
hybridization using paraffin sections and digoxigenin-labeled
riboprobes following a protocol developed by St. Croix et al. (12), and
we hybridized frozen sections as described with minor
modifications (13).

Dermcidin (DCD) Expression in Mammalian Cells and Growth and
Survival Assays. We generated an N-terminal alkaline phosphatase
(AP) C-terminal DCD fusion protein using the AP-TAG-5
expression vector (GenHunter, Nashville, TN). We transfected
mammalian cells with FuGENE6 (Roche), Lipofectamine, or
Lipofectamine 2000 (Life Technologies, Rockville, MD)
reagents. For mammalian expression, we subcloned the human
DCD cDNA into the pBabe construct and confirmed DCD
protein expression by immunoblot analysis. To determine the
effect of DCD expression on cell growth, we plated 5,000 control
(pBabe) or DCD-expressing (pBabe-DCD) cells per well in a
24-well plate, and 21NT cells were grown in either complete
MCF10A medium (American Type Culture Collection) or
MCF10A medium diluted 1:10 with basal MCF10A medium
without growth factors added. Cells were counted (three wells
per time point) on days 1, 3, 5, and 7 after plating. For menadione
survival assays, 21NT pBabe and 21NT pBabe-DCD stable pools
were plated (10^5 cells per well in a 24-well plate). At 6 h, cells
were washed and medium was changed to serum-free DMEM-F12
medium with or without menadione (0, 100, and 200 µM;
three wells per treatment), and cells were counted at 24 h. The
experiment was repeated three times. For glucose deprivation
assays, 21NT pBabe and 21NT pBabe-DCD stable pools were
plated (5 × 10^4 cells per well in a 24-well plate). At 6 h, cells were
washed and medium was changed to basal media with 5% or
0.5% horse serum and 0 or 4 mM glucose (three wells per
treatment), and cells were counted at 48 h. The experiment was
repeated three times.

This paper was submitted directly (Track II) to the PNAS office.

Abbreviations: IBC-1, invasive breast cancer 1; ROS, reactive oxygen species; SAGE, serial analysis of gene
expression.

Fluorescence in Situ Hybridization (FISH) and Array Comparative
Genomic Hybridization (CGH). We obtained bacterial artificial
chromosomes (BACs) containing the human DCD, CDK4, SAS, GLI,
and MDM2 genes from Research Genetics (Huntsville, AL). The
D12Z3 probe for identification of chromosome 12 was obtained
from Vysis (Naperville, IL). We performed FISH on paraffin-embedded
or frozen tissue and metaphase chromosomes from
normal  human  lymphocytes  as  described  (14).  We  performed 
BAC  array  CGH  essentially  as  described  (15).  DNA  copy 
number  variations  that  deviated  significantly  (at  least  three  times 
higher  than  the  standard  deviation  of  the  overall  fluorescence 
intensity  of  the  tumor  DNA)  from  background  ratios  measured 
in  normal  genomic  DNA  control  hybridizations  were  considered 
real  copy  number  variations.  In  the  case  of  the  BAC  containing 
DCD,  the  average  log  fluorescence  ratio  was  0.3.  The  detailed 
results  of  the  array  CGH  analysis  of  the  152  breast  tumors  will 
be  reported  elsewhere. 
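
The calling rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the per-clone log ratios, the clone names, the flat background, and the choice to measure deviation as the tumor ratio minus the normal-versus-normal background ratio are all hypothetical. Only the threshold factor of three standard deviations and the 0.3 log ratio quoted for the DCD-containing BAC come from the text.

```python
from statistics import stdev


def call_copy_number_changes(tumor, background, k=3.0):
    """Flag clones whose tumor log ratio deviates from background by >= k SDs.

    tumor:      dict mapping clone name -> log fluorescence ratio (tumor vs. reference)
    background: dict mapping clone name -> log ratio from a normal-vs-normal
                control hybridization (clones not listed default to 0.0)
    k:          threshold multiplier; 3 follows the text, the rest is assumed
    """
    sd = stdev(list(tumor.values()))  # SD across all tumor log ratios on the array
    calls = {}
    for clone, ratio in tumor.items():
        delta = ratio - background.get(clone, 0.0)
        if abs(delta) >= k * sd:
            calls[clone] = "gain" if delta > 0 else "loss"
    return calls


# Toy data (hypothetical): 40 unchanged clones plus one clone at a log ratio of 0.3.
toy_tumor = {f"clone{i}": (0.02 if i % 2 else -0.02) for i in range(40)}
toy_tumor["DCD_BAC"] = 0.3
toy_background = {}  # assumption: flat (zero) normal-vs-normal background

print(call_copy_number_changes(toy_tumor, toy_background))
```

Because a strong outlier inflates the array-wide standard deviation, a real analysis might prefer a robust spread estimate; the plain standard deviation is used here only because it is what the text names.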

Antibodies and Immunoblot, Immunohistochemical, and Statistical
Analyses. We generated an affinity-purified polyclonal anti-DCD
antibody against a synthetic peptide (RQAPKPRKQRSS)
corresponding to amino acids 53-64 of the human DCD protein
(Zymed), and other antibodies used were obtained from sources
previously described (6). We performed immunoblot analyses as
described (11). We analyzed the expression of DCD in primary
tumors by immunohistochemistry on tissue microarrays
that contained evaluable paraffin-embedded specimens
derived from ductal carcinoma in situ, primary invasive breast
cancer and distant breast cancer metastases, pancreatic, gastric,
prostate, kidney, and colon carcinomas, melanomas, lymphomas,
and gliomas. Immunohistochemical and statistical analyses
were performed as described (6).

Ligand  Binding  Assays.  We  performed  in  vivo  and  in  vitro  ligand 
binding  assays  on  primary  tissues  and  cell  lines  using  AP-DCD 
essentially  as  described  (16).  Briefly,  we  fixed  frozen  sections  of 
various  human  specimens,  incubated  with  either  AP-DCD  fusion 
protein  or  AP  control-conditioned  medium,  rinsed,  and  then 
incubated  with  AP  substrate  forming  a  blue/purple  precipitate. 
For in vitro assays we incubated cells in suspension with
conditioned medium containing either AP alone or AP-DCD fusion
protein,  rinsed,  and  then  assayed  for  bound  AP  activity. 


Results 

Identification  of  Invasive  Breast  Cancer  1  (IBC-1)/DCD.  To  identify 
genes  implicated  in  breast  tumorigenesis  we  determined  the  gene 
expression profiles of normal mammary epithelial cells and in
situ, invasive, and metastatic breast carcinomas using SAGE.
Using this approach we identified a SAGE tag with no database
match that was highly expressed only in a subset of invasive
breast carcinomas (17, 18), which we designated IBC-1. Searching the
human genome sequence with the IBC-1 SAGE tag and 5' NlaIII
site (5'-CATGACGTTAAAGAC-3'), we identified a genomic
clone containing this tag and predicted (19) that it encodes a
transcribed gene composed of five exons (Fig. 1A). Confirming
the restricted expression pattern suggested by SAGE, based on
Northern blot hybridization IBC-1 was expressed in only two
regions of the brain: in the pons and, at a lower level, in the
paracentral gyrus of the cerebral cortex, and not in 75 other
normal human adult and fetal tissues (Fig. 1B). The predicted
IBC-1 gene encodes a 110-aa protein with limited homology to
lacritin and an EST containing an N-terminal signal peptide
(Fig. 1 C and D). Further database searches using the predicted
IBC-1 protein sequence revealed that IBC-1 nearly matches a
20-aa peptide derived from the mouse proteolysis-inducing
factor or cachectic factor, and exactly matches a 30-aa neural
survival-promoting peptide (20, 21) (Fig. 1C). While this work
was  in  progress  another  group  independently  identified  a  cDNA 
from  human  sweat  glands  identical  to  IBC-1  and  named  it  DCD 
(22).  Thus,  to  avoid  confusion  due  to  multiple  gene  names,  we 
renamed  IBC-1  as  DCD. 
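
The tag-to-genome search described above can be illustrated with a short Python sketch. It is a simplified stand-in for the database search, not the procedure used in the study: the function names and the toy sequences are hypothetical, and only the query string (the NlaIII site CATG followed by the IBC-1 tag) is taken from the text. The sketch scans both strands for exact matches and also shows the usual SAGE convention of taking the bases immediately 3' of the last CATG in a transcript as its tag.

```python
ANCHOR = "CATG"            # recognition site of the NlaIII anchoring enzyme
QUERY = "CATGACGTTAAAGAC"  # NlaIII site plus IBC-1 SAGE tag, as given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_query(sequence, query=QUERY):
    """Return (strand, 0-based start) for every exact match on either strand.

    For '-' hits the start is where the reverse-complemented query begins
    in the supplied sequence.
    """
    seq = sequence.upper()
    hits = []
    for strand, probe in (("+", query.upper()), ("-", reverse_complement(query))):
        start = seq.find(probe)
        while start != -1:
            hits.append((strand, start))
            start = seq.find(probe, start + 1)
    return hits


def sage_tag(transcript, tag_len=10):
    """Return the tag_len bases 3' of the last CATG site, or None if unavailable."""
    seq = transcript.upper()
    pos = seq.rfind(ANCHOR)
    if pos == -1:
        return None
    tag = seq[pos + len(ANCHOR): pos + len(ANCHOR) + tag_len]
    return tag if len(tag) == tag_len else None


# Toy sequence (hypothetical): the query embedded in arbitrary flanking bases.
toy_genome = "GGGG" + QUERY + "TTTT"
print(find_query(toy_genome))
print(find_query(reverse_complement(toy_genome)))
```

The default tag length of 10 bases is an assumption reflecting the classic SAGE protocol; the query quoted in the text carries 11 bases after CATG, so `tag_len=11` would be needed to reproduce it exactly.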

Expression of DCD in Breast Carcinomas and Correlation with
Histopathologic Features. Next, we analyzed the expression of DCD
in normal breast organoids, primary breast carcinomas, and
breast cancer cell lines by Northern blot, RT-PCR, and mRNA
in situ hybridization analyses and determined that it was
expressed only in a subset of breast cancer cell lines and primary
tumors (Fig. 2 A-C and data not shown). To determine the
expression of DCD at the cellular level we performed mRNA in
situ hybridization. Intense red or black (depending on the
hybridization protocol used) staining demonstrates that DCD is
expressed in tumor cells and not in stromal cells (Fig. 2C). No signal
was  observed  in  adjacent  normal  mammary  epithelial  cells  (Fig. 
2C).  In  tumors  15  and  238  only  a  subset  of  cells  showed  high 
DCD  expression  indicating  intratumoral  heterogeneity  (Fig.  2C). 

To  evaluate  the  expression  of  the  DCD  protein  we  performed 
immunohistochemical analysis of several tissue microarrays
composed  of  breast  carcinomas  (Fig.  2D).  Correlating  with  our 
SAGE  results  we  detected  DCD  expression  in  primary  invasive 
breast  carcinomas  (48/558),  and  rarely  in  ductal  carcinoma  in 
situ  (1/70)  or  distant  metastases  (1/49).  Statistical  analysis 
determined  that  DCD-positive  breast  tumors  were  more  likely  to 
be  of  advanced  stage  (tumor  node  metastasis  stage  2  or  3,  mostly 
due to higher T and N, P = 0.007), indicating that DCD
expression correlates with larger tumor size and with the
presence of metastatic lymph nodes. Because both of these tumor
characteristics  are  known  to  predict  a  bad  prognosis,  we  analyzed 
DCD  expression  in  relation  to  overall  and  distant  metastasis-free 
survival  in  a  subset  of  breast  tumors  with  clinical  follow-up  data. 
Patients  with  DCD-positive  tumors  appeared  to  have  decreased 
overall  and  distant  metastasis-free  survival,  but  this  decrease  did 
not  reach  statistical  significance  (data  not  shown). 
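
The positivity fractions quoted above can be tabulated with a few lines of Python. The counts are those reported in the text; the Wilson score interval is added purely for illustration and is not part of the reported statistical analysis, and the function and variable names are hypothetical. The invasive carcinoma fraction, 48/558, works out to about 8.6%, in line with the approximately 10% figure stated elsewhere in the paper.

```python
from math import sqrt

# Positive / evaluable specimen counts as reported in the text.
COUNTS = {
    "primary invasive breast carcinoma": (48, 558),
    "ductal carcinoma in situ": (1, 70),
    "distant metastasis": (1, 49),
}


def wilson_interval(positives, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = positives / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


for label, (pos, n) in COUNTS.items():
    low, high = wilson_interval(pos, n)
    print(f"{label}: {pos}/{n} = {pos / n:.1%} (95% CI {low:.1%} to {high:.1%})")
```

The Wilson interval is chosen over the simple normal approximation because two of the three groups have only a single positive specimen, where the normal approximation can produce a lower bound below zero.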

We  also  analyzed  DCD  expression  in  multiple  human  tumor 
types  and  found  that  2/64  pancreatic  carcinomas  expressed 
DCD.  Thus,  DCD  overexpression  may  occur  in  other  human 
tumor  types,  but  the  determination  of  this  will  require  the 
examination  of  large  tumor  sets  from  each  tumor  type.  Although 
the  staining  of  melanomas  did  not  detect  any  DCD-positive 
tumor cells, adjacent sweat glands of the skin were strongly
DCD-positive (Fig. 2D), confirming DCD expression in sweat
glands (22).

Fig. 1. DCD and its homologues. (A) Genomic structure of the human IBC-1/DCD gene. Exon-intron boundaries, start and stop codons, and the SAGE tag that
led to the identification of IBC-1/DCD are indicated. (B) Evaluation of DCD expression in 76 human adult and fetal tissues on a dot-blot expression array. High
level of expression was detected only in the pons of the brain, whereas low-level expression was seen in the paracentral gyrus of cerebral cortex. (C) Human
IBC-1/DCD cDNA and predicted amino acid sequence. Sequences of the peptides derived from cachectic factor and survival peptide are indicated by thick and
thin underlining, respectively. An arrow marks the predicted secretory signal peptidase cleavage site. (D) Amino acid alignment of DCD, lacritin, and EST-AI12471
proteins. Amino acids identical to the consensus are shaded in gray. Comparison was made by using DNAStar and the clustal algorithm.

Despite an extensive analysis of cell lines from various tumor
types, we were not able to identify a cell line that endogenously
expresses the DCD protein at levels detectable by Western blot
analysis using our antibody (data not shown). Thus, to confirm
that the DCD transcript we identified encodes a protein that
exists in vivo, we performed immunoblot analysis of DCD-

Fig. 2. Expression of DCD in normal and cancerous tissues. (A) Northern blot analyses of normal breast organoids, breast cancer cell lines, primary breast
carcinomas, and corresponding normal breast tissue. High DCD expression is detected in only two tumors (15 and 236). The blot was rehybridized with β-actin
to indicate equal loading. (B) RT-PCR analyses of breast cancer cell lines using DCD- and β-actin-specific primers. (C) mRNA in situ hybridization using
digoxigenin-labeled DCD riboprobes on tissue sections (tumor and normal 236, red staining; tumors 15 and 238, black staining). Adjacent section stained with
hematoxylin/eosin. (D) DCD immunostaining of normal breast tissue, ductal carcinoma in situ, a DCD-positive invasive breast carcinoma (IDC), and sweat gland
of the skin. (E) Immunoblot analysis of human sweat and cells transfected with empty or DCD-expressing vector. An ≈11-kDa protein is detected in both
transfected cells and sweat.


Porter  et  al. 


PNAS | September 16, 2003 | vol. 100 | no. 19 | 10933

Fig. 3. Gain of DCD locus in breast carcinomas. (A) Idiogram of human chromosome 12 and the expression of genes adjacent to DCD in SAGE libraries generated
from normal (N1 and N2) breast tissue, and in situ (D1-8), invasive (I1-6), and metastatic (LN1-2 and M1) breast carcinomas. Genes closest to DCD (highlighted
with yellow color), lacritin (LACRT), and, to a lesser extent, a phosphatase subunit (PPP1R1A) are expressed only in the three tumors with high levels of DCD,
suggesting possible amplification of this chromosomal area. No other genes near DCD appear to be overexpressed in these breast tumors. (B) FISH analysis of
DCD to normal metaphase chromosomes shows hybridization at 12q13 on both copies of chromosome 12 (Left). Hybridization of DCD (red) and an alpha-satellite
probe to the centromere of chromosome 12 (green) reveals amplification of DCD and disomy of chromosome 12 in tumor 12 interphase cells (Center). Analysis
of DCD (green) and CDK4 (red) reveals coamplification in tumor 12 interphase cells (Right). (C) A representative BAC array CGH profile demonstrating a gain of
the DCD locus (arrow).


transfected cells and human sweat. Correlating with its predicted
molecular weight, the exogenously expressed recombinant DCD
protein migrates as a single ~11-kDa protein, and a protein of
approximately the same size is also detected in sweat (Fig. 2E).
The slightly higher and lower molecular weight proteins recognized
with our DCD antibody in the sweat may correspond to
posttranslationally modified or partially proteolyzed DCD (Fig.
2E). These results confirm that a full-length DCD protein is
expressed and secreted in vivo.

Focal Copy Number Gain of the DCD Locus in Breast Carcinomas.
Based on the human genome sequence, DCD was localized to
chromosome 12 in band q13.1, which we confirmed by FISH
(Fig. 3B). Examination of the expression of all known and
predicted genes in the vicinity (±5 megabases) of DCD determined
that two genes localized next to DCD were expressed only
in the same three breast carcinomas that expressed DCD and
were not detected in any of the other >100 SAGE libraries (Fig.
3A and data not shown). This suggested that the overexpression
of DCD in breast tumors may be due to gene amplification. To
determine whether the DCD locus is amplified in the
DCD-overexpressing tumors we performed FISH and detected
moderate levels of DCD amplification in tumor 12 (Fig. 3B). We also
analyzed several known oncogenes (CDK4, SAS, GLI, and
MDM2) localized to 12q and detected only CDK4 amplification
in tumor 12 (Fig. 3B and data not shown). In three other tumors
(I1, LN1, and 236) overexpressing DCD the FISH pattern was
consistent with three to five copies of DCD and all of the other
genomic regions tested, suggesting that a large part of
chromosome 12 was gained (data not shown). However, based on SAGE,
these oncogenes (MDM2, CDK4, SAS, etc.) were not overexpressed
in DCD-positive breast tumors (Fig. 3A). To establish
how frequently a gain of the DCD locus is detected in breast
tumors, we analyzed an independent set of 152 breast tumors by
using BAC array CGH and found a significant focal copy number
increase of the DCD locus in 20 tumors (Fig. 3C).

To further investigate the association between gain of the
DCD locus and the overexpression of the DCD protein, we
performed immunohistochemical analysis on eight breast tumors
that showed a 12q13.1 focal copy number increase of the
DCD locus based on BAC array CGH. Five of these eight tumors
expressed the DCD protein, which is a much higher fraction than
expected (P = 0.0003) based on the frequency of DCD positivity
in unselected tumors (48/558). Thus, this result further suggests
that at least in some cases DCD overexpression in breast tumors
is due to a gain of the 12q13.1 chromosomal area.
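As a rough check on the enrichment reported above, the following sketch computes a one-sided exact binomial tail probability for observing at least five DCD-positive tumors among eight, taking the 48/558 frequency in unselected tumors as the background rate. The text does not state which statistical test produced P = 0.0003, so the choice of a binomial test and the function name `binomial_upper_tail` are assumptions made for illustration; the result should be of the same order of magnitude as the reported value, not an exact reproduction of it.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Background frequency of DCD positivity in unselected tumors (from the text).
background_rate = 48 / 558

# Five of the eight tumors with a focal 12q13.1 gain were DCD-positive.
p_value = binomial_upper_tail(5, 8, background_rate)

print(f"background rate = {background_rate:.3f}, one-sided P = {p_value:.1e}")
```

A Fisher exact test on the corresponding 2 x 2 table would be an equally reasonable choice and would give a somewhat different value.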

DCD Is a Growth and Survival Factor for Breast Cancer Cells. To
analyze the effect of DCD overexpression on breast cancer cell
growth and survival we established derivatives of the 21NT
breast cancer cell line, chosen based on its features resembling
DCD-expressing primary breast tumors, that stably overexpressed
DCD. Next we compared the growth of pools of control,
empty vector transfected cells with that of cells expressing DCD
and found that DCD-expressing 21NT cells grew significantly
faster than controls, especially in reduced serum-containing
medium (Fig. 4A). Similar results were obtained in
DCD-expressing VA13-transformed fibroblasts and C2C12 myoblasts,
whereas preliminary data suggest that DCD has no effect on
immortalized mammary epithelial cells (data not shown).

To determine the effect of DCD expression on cell survival
after oxidative stress, we treated control and DCD-expressing
21NT cells with varying concentrations of menadione, a potent
inducer of mitochondrial reactive oxygen species (ROS)
production. As depicted in Fig. 4B Left, DCD-expressing cells were
significantly more resistant to menadione-induced cell death
than control 21NT cells. To establish whether the DCD-mediated
protection from ROS-induced cell death is important
in a more physiologic oxidative stress-inducing condition, we
analyzed the effect of glucose deprivation on control 21NT and
DCD-expressing cells. Cancer cells are known to be particularly
sensitive to the withdrawal of glucose that leads to increased
mitochondrial ROS production and subsequent cell death (23).
Similar to the results obtained with menadione, DCD-expressing
21NT cells survived growth in glucose-free medium significantly
better than control cells, with the most pronounced difference
seen in low-serum-containing medium (Fig. 4B Right).

Cell Surface DCD Binding. The DCD protein is predicted to be
secreted, suggesting that DCD is likely to execute its function
through binding to a cell surface receptor. To determine whether
there is a DCD-binding cell surface protein(s), we generated an
AP-DCD fusion protein to be used as a ligand in receptor
binding assays (16). Conditioned medium containing AP-DCD
or control AP was used to stain normal and cancerous mammary
tissue sections. Intense purple staining indicated the presence of
a DCD-binding protein in tumor 236, but not in normal mammary
epithelial and stromal cells, whereas low-intensity staining
was observed in tumor 19 (Fig. 4C). These results suggested the
presence of a cell surface DCD-binding protein(s) in cancerous,
but not normal, mammary epithelial cells, and are consistent
with an autocrine and/or paracrine mechanism of DCD action.

Because of its expression pattern in normal human brain, we
also tested whether neurons bind DCD (Fig. 4C). Weak DCD
binding to almost all neurons was seen in human adult brain
(data not shown), whereas the strongest DCD binding was
detected in neurons of the locus ceruleus, nucleus raphe pontis,
substantia nigra, and the lateral hypothalamic nuclei (Fig. 4C).

To further test the binding characteristics of AP-DCD, we
performed in vitro ligand binding assays using various cell lines.
Low-level AP-DCD binding was detected in all cell lines tested,
with stronger binding observed in human 21NT breast cancer
cells (data not shown). To further characterize the
AP-DCD-putative DCD-receptor interaction, we performed more
detailed binding assays in 21NT breast cancer cells. Scatchard plot
analysis showed two binding slopes in 21NT cells (Fig. 4D): one
with a moderately high affinity (Kd = 1.5 × 10⁻⁸ M) and another
with much lower affinity (Kd = 2.1 × 10⁻⁷ M). Further proving
that DCD's effect is mediated through a cell surface receptor and
that the AP-DCD fusion protein is a functional ligand for the
putative DCD receptor, 21NT cells incubated in conditioned
medium containing AP-DCD, or treated with purified AP-DCD,
grew faster than controls (Fig. 4E and data not shown).

Fig. 4. DCD function and receptors. (A) Growth curves of control (pBabe) and DCD-expressing (pBabe-DCD) 21NT breast cancer cells in high (5%) or low (0.5%)
serum-containing media. DCD-expressing cells grew significantly faster in both conditions. A representative experiment is shown. (B) Survival data showing
reduced susceptibility of 21NT cells expressing DCD to cell death induced by menadione or glucose deprivation. Data are the mean of three experiments with
three determinations each. (C) In situ staining for DCD receptor in breast and brain tissue. Sections of breast tumors, normal breast tissue, and brain were
incubated with AP control or AP-DCD fusion protein. Purple staining detects the presence of a putative DCD receptor. Faint brownish coloring of neurons of the
locus ceruleus and substantia nigra in the control AP sections is due to natural pigment (melanin) present in these cells. (D) Scatchard transformation of binding
analysis of AP-DCD to 21NT breast cancer cells. (Inset) The actual binding curve. (E) Growth curves of cells treated with purified AP-DCD.

Discussion

Based on SAGE analysis of breast tumors of different clinical
stages we identified DCD, a novel growth and survival factor for
breast cancer cells. DCD encodes a secreted protein with limited
homology to lacritin and an EST. Lacritin is a secretion-enhancing
and growth-promoting factor recently identified from
lacrimal gland (24). The EST is expressed in the cerebral cortex,
and it encodes an uncharacterized protein containing a repetitive
sequence (ETPA) found in several secreted proteins. In
addition, two small proteolytic peptides identified as a cancer
cachexia factor and a neural survival-promoting peptide,
respectively, were likely to be derived from DCD (20, 21). The
cachexia- and proteolysis-inducing factor was identified as a
24-kDa glycoprotein produced by the cachexia-inducing MAC16
murine colon adenocarcinoma in mice and was shown to be
present in the urine of cachectic cancer patients (25-27). The
neural survival-promoting peptide was identified from the media
of mouse HN33.1 hippocampal neurons and human Y79
retinoblasts treated with hydrogen peroxide and was shown to
enhance neural survival after an oxidative insult (21, 28).

Based on FISH and array CGH analysis we determined that
the overexpression of DCD in breast tumors is due to the focal
copy number increase of the DCD locus. The low level of gain
observed in array CGH could be due to the fact that the breast
tumors used for this analysis were not microdissected; thus,
contaminating stromal cells with two copies of the DCD locus
may decrease the hybridization signal. In addition, as depicted in
Figs. 2C and 3B, the expression and gain of DCD are heterogeneous
in most breast tumors, with only a fraction of tumor cells
being positive; thus, even a significant copy number gain may be
detected only as a low level gain when the tissue is analyzed in
bulk by using array CGH. Correlating with this, cDNA array
CGH analysis of the tumors with significant DCD gain based on
FISH revealed no significant copy number increase (data not
shown). The minimum region of chromosomal gain based on
BAC aCGH is 4.2 megabases, because the two flanking BACs
that do not show gain are that far apart. This region encompasses
DCD, CDK4, and SAS. However, based on SAGE, we did not see
overexpression of any of these oncogenes (MDM2, CDK4, SAS,
etc.) in DCD-positive breast tumors (Fig. 3A).
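The dilution argument made above for bulk array CGH can be illustrated with a small calculation. The sketch below assumes a simple linear mixing model, in which the bulk signal reflects the average copy number of the locus across all cells in the sample. The function name, the parameter names, and the example values (50% tumor cells, 30% of which carry eight copies) are hypothetical and are not taken from the paper.

```python
from math import log2


def expected_log2_ratio(tumor_fraction: float,
                        amplified_fraction: float,
                        amplified_copies: float,
                        normal_copies: float = 2.0) -> float:
    """Expected bulk log2(test/reference) ratio under linear mixing.

    tumor_fraction: fraction of cells in the sample that are tumor cells.
    amplified_fraction: fraction of tumor cells that carry the gain.
    amplified_copies: locus copy number in the cells that carry the gain.
    normal_copies: locus copy number in all remaining cells.
    """
    gained = tumor_fraction * amplified_fraction
    mean_copies = gained * amplified_copies + (1.0 - gained) * normal_copies
    return log2(mean_copies / normal_copies)


# Hypothetical values, chosen only to illustrate the effect.
pure = expected_log2_ratio(1.0, 1.0, 8.0)    # every cell carries 8 copies
mixed = expected_log2_ratio(0.5, 0.3, 8.0)   # stroma plus intratumor heterogeneity

print(f"pure population: {pure:.2f}, bulk sample: {mixed:.2f}")
```

Under these illustrative assumptions, a gain that would give a log2 ratio of 2 in a pure population of amplified cells is reduced to a much smaller ratio in the bulk sample, which is consistent with the low-level gain described in the text.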

Consistent with being a putative oncogene, the overexpression
of DCD in breast cancer cells enhanced cell growth and survival
and reduced serum dependency. Because in the cell survival
experiments performed with menadione, cell viability was
determined 24 h after plating the cells, the difference observed
in live cell numbers is unlikely to be the effect of DCD on
cell growth. Conversely, the effect of DCD on cell growth cannot
fully be explained by its ability to protect against ROS generated
because of culturing the cells under supraphysiologic (21%
atmospheric) oxygen concentrations, as a similar effect was seen
in cells grown at physiologic (3%) oxygen concentration (data
not shown). Thus, the growth and survival-promoting effects of
DCD appear to be distinct, although to conclusively prove this
will require the detailed characterization of the DCD-signaling
pathway. Correlating with the observed in vitro effect of DCD on
breast cancer cell growth and survival, DCD-expressing primary
breast tumors were larger and more likely to have metastatic
lymph nodes. Based on these results, it is likely that the
overexpression and copy number gain of DCD confer a selective
advantage for breast tumor cells.

Based on in vivo ligand binding studies performed with an
AP-DCD fusion protein, we detected strong cell surface DCD
binding to breast cancer cells and neurons of the brain.
Interestingly, catecholaminergic (noradrenergic and dopaminergic)
neurons of the brain that strongly bound AP-DCD are particularly
susceptible to oxidative stress because the biosynthesis of
these neurotransmitters from tyrosine requires molecular oxygen.
Moreover, the autooxidization of catecholamines, the end
product of which is melanin that accumulates in neurons of the
substantia nigra and locus ceruleus, leads to the generation of
ROS (H₂O₂, O₂⁻, and OH·). The strong binding of DCD to these
neurons is consistent with the roles of a 30-aa neural survival
factor (21) and a cachexia factor possibly derived from DCD
(20). Definition of the relationships among these peptides
requires further studies.

In summary, DCD is a novel growth and survival factor that
is overexpressed in ≈10% of primary invasive breast carcinomas,
and its overexpression, at least in some cases, is associated with
a gain of its locus at 12q13.1. Based on its function and restricted
expression pattern in normal adult tissues, DCD is a candidate
cancer therapeutic target. The secreted nature and extracellular
mechanism of DCD action make it even more attractive for such
a purpose.

Neurons are particularly sensitive to ROS, whereas tumor cells
themselves produce large amounts of ROS (29). Therefore, the
high expression of DCD in these cell types may be essential for
their survival. Thus, therapeutic activation of the DCD-signaling
pathway may be beneficial in certain neurodegenerative diseases
involving catecholaminergic neurons such as Parkinson's disease,
and its therapeutic inhibition may be an effective treatment
of tumors with DCD expression.

Summary

Here we describe the comprehensive gene expression profiles of each cell type composing normal breast tissue and in
situ and invasive breast carcinomas using serial analysis of gene expression. Based on these data, we determined that
extensive gene expression changes occur in all cell types during cancer progression and that a significant fraction of
altered genes encode secreted proteins and receptors. Despite the dramatic gene expression changes in all cell types,
genetic alterations were detected only in cancer epithelial cells. The CXCL14 and CXCL12 chemokines overexpressed in
tumor myoepithelial cells and myofibroblasts, respectively, bind to receptors on epithelial cells and enhance their
proliferation, migration, and invasion. Thus, chemokines may play a role in breast tumorigenesis by acting as paracrine factors.


SIGNIFICANCE

Despite compelling cell biological studies and histopathological observations incriminating myoepithelial and stromal cells in
tumorigenesis, our knowledge of the genes that mediate changes in the tumor microenvironment and interactions among various cell
types in breast cancer and their role in tumorigenesis is limited. Similarly, the occurrence and role of genetic changes in stromal
cells are undefined. Here, we describe a comprehensive molecular characterization of each cell type composing normal breast tissue
and in situ and invasive breast carcinomas. We identified several genes as potential mediators of epithelial-stromal/myoepithelial cell
interactions, including the CXCL12 and CXCL14 chemokines. These data should therefore provide a valuable resource for future
basic and clinical studies addressing the role of epithelial-stromal/myoepithelial cell interactions in breast cancer.


Introduction

Breast cancer is the most commonly identified and one of the
deadliest neoplasms in women in Western countries. The recent
trend toward improvement in breast cancer mortality rate is
largely due to increased diagnosis of early stage disease, while
our therapeutic options for advanced stage breast carcinomas
are still fairly limited. Thus, there is a need to better understand
the molecular basis of breast cancer initiation and progression
and to use this knowledge for the design of targeted,
molecular-based therapies. In the past few years, newly developed
technologies such as microarrays and SAGE (serial analysis of gene
expression) have enabled us to analyze molecular differences
between normal and cancer cells at a genome-wide level in
comprehensive and unbiased ways (Schena et al., 1995;
Velculescu et al., 1995). Using these approaches, the
molecular-based classification of breast cancer has become a reality, and
molecular signatures correlating with metastatic behavior and
clinical outcome have been identified (Ramaswamy et al., 2003;
Sorlie et al., 2001; van 't Veer et al., 2002; van de Vijver et al.,
2002). However, since most of these analyses were performed
using bulk tissue samples that are composed of multiple cell
types or purified tumor epithelial cells, the specific contribution
of epithelial and stromal cells to these tumor classifiers and
prognostic signatures is unknown. Similarly, in the past decades
the major focus of cancer research has been the transformed
tumor cell itself, while the role of the cellular microenvironment
in tumorigenesis has not been widely explored. Early studies
demonstrated the ability of stromal tissues to regulate the
growth and differentiation state of breast cancer cells (DeCosse
et al., 1973, 1975), and several recent in vivo and in vitro studies
have demonstrated that the growth, differentiation, invasive
behavior, and polarity of normal mammary epithelial cells and
breast carcinomas are influenced by surrounding stromal cells
including fibroblasts, myofibroblasts, leukocytes, and
myoepithelial cells (Bissell and Radisky, 2001; Radisky et al., 2001;
Tlsty, 2001). In addition, certain histopathological features of
breast tumors, including lymphocytic infiltration, fibrosis, and
angio- and lymphangiogenesis, have proven prognostic
significance. Despite these convincing data implicating a role for the
tumor microenvironment in breast tumorigenesis, our
understanding of the genes mediating cellular interactions and
paracrine regulatory circuits among various cell types in normal and
cancerous breast tissue and their role in breast tumorigenesis
is limited.

In the past few years, the role of the cellular microenvironment
in tumorigenesis has become an intense area of research.
This is in part due to studies demonstrating that genetic
abnormalities, such as loss of heterozygosity (LOH), occur not only
in cancer cells, but in stromal cells as well (Kurose et al., 2001,
2002; Lakhani et al., 1998; Moinfar et al., 2000). However, no
genes presumably targeted by these genetic events in stromal
cells have been identified; thus, their role in breast tumorigenesis
is still unknown.

As a consequence of studies focusing almost exclusively on
cancer cells, nearly all of the currently used cancer therapeutic
agents target the cancer cells that, due to their inherent genomic
instability, frequently acquire therapeutic resistance
(Rajagopalan et al., 2003). In part due to frequent therapeutic failures
during the course of treatment of advanced stage tumors,
increasing emphasis has been placed on targeting various stromal
cells, particularly endothelial cells, via therapeutic interventions.
Since these cells are thought to be normal and genetically stable,
they are less likely to develop acquired resistance to cancer
therapy. Thus, isolating and characterizing each cell type
(epithelial, myoepithelial, and various stromal cells) comprising
nonmalignant and cancerous breast tissue would not only help us
to understand the role these cells play in breast tumorigenesis,
but would likely give us new molecular targets for cancer
intervention and treatment.

Results 

Purification of all cell types present in breast tissue

To determine the molecular profile of each of the cell types that
together compose the breast tissue and to identify autocrine
and paracrine interactions that may play a role in breast tumor
progression, we developed a purification procedure that allows
the isolation of pure cell populations from normal breast tissue
and from in situ (ductal carcinoma in situ, DCIS) and invasive
breast carcinomas (Figure 1A). We utilized cell type-specific cell
surface markers and magnetic beads for the rapid sequential
isolation of the various cell types. We used the BerEP4 antigen
restricted to epithelial cells, the CD45 panleukocyte marker,
and the P1H12 antibody that specifically recognizes endothelial
cells. The CD10 antigen is present in myoepithelial cells and
myofibroblasts, but also in some leukocytes. Thus, to minimize
the crosscontamination of these different cell types, in the case
of normal (N-MYOEP-1) and DCIS breast tissue, myoepithelial
cells were isolated from organoids (breast ducts), while in
invasive tumors we first removed the leukocytes prior to capturing
the myofibroblasts using CD10 beads. Several recent studies
reported that some morphologically distinct myoepithelial cells
lack CD10 and other myoepithelial cell markers (Zhang et al.,
2003). Thus, due to the use of CD10 beads for the isolation of
myoepithelial cells, a subset of myoepithelial cells may have
been excluded from our study. We were not able to identify an
antibody that would specifically recognize fibroblasts and allow
their purification; thus, we used the unbound fraction following
the removal of all other cell types as a fibroblast-enriched
"stromal" fraction. The purification method is described in detail
in the Supplemental Data
(http://www.cancercell.org/cgi/content/full/6/1/HB*/DC1). Since
this protocol includes sequential enzymatic digestion of the
tissue, the possibility that the expression of some genes could be
altered due to the procedure cannot be excluded. However, since
we were able to verify the SAGE data by alternative methods using
unprocessed tissue (Figure 3), these changes (if any) are likely to
be minimal. The success of the purification method and the purity
of each cell fraction were confirmed by performing RT-PCR on a
small fraction of the isolated cells using cell type-specific genes
(Figure 1B). The remaining portion of the cells (~10,000-100,000
cells, depending on the sample) was used for the generation of
micro-SAGE libraries following previously described protocols
(Porter et al., 2001, 2003a) and for the isolation of genomic DNA
to be used for array comparative genomic hybridization (aCGH) and
single nucleotide polymorphism (SNP) array studies. We have
generated SAGE libraries from epithelial and myoepithelial cells
(myofibroblasts from invasive tumors), infiltrating lymphocytes,
endothelial cells, and fibroblasts (stroma) from one normal
breast reduction tissue, two different DCIS, and three invasive
breast tumors. Not all libraries were generated from all cases
due to our inability to obtain sufficient amounts of purified cells.
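The logic of the sequential bead capture described above can be sketched in a few lines of code. This is a toy model only: each cell is represented by the set of surface markers it carries, and the capture order shown is an assumption, since the text states only that leukocytes (CD45) are removed before CD10 capture in invasive tumors. The names `CAPTURE_STEPS` and `sequential_capture` are hypothetical and do not correspond to any published software.

```python
from typing import Dict, FrozenSet, List, Tuple

Cell = FrozenSet[str]  # a cell is modeled as the set of surface markers it carries

# Capture order is an assumption for illustration; the text only states that
# leukocytes (CD45) are removed before CD10 capture in invasive tumors.
CAPTURE_STEPS: List[Tuple[str, str]] = [
    ("BerEP4", "epithelial"),
    ("CD45", "leukocyte"),
    ("P1H12", "endothelial"),
    ("CD10", "myoepithelial/myofibroblast"),
]


def sequential_capture(cells: List[Cell]) -> Dict[str, List[Cell]]:
    """Assign each cell to the first bead step whose marker it carries."""
    fractions: Dict[str, List[Cell]] = {label: [] for _, label in CAPTURE_STEPS}
    fractions["stroma (unbound)"] = []
    for cell in cells:
        for marker, label in CAPTURE_STEPS:
            if marker in cell:
                fractions[label].append(cell)
                break
        else:
            # No bead bound: the cell stays in the fibroblast-enriched fraction.
            fractions["stroma (unbound)"].append(cell)
    return fractions


sample = [
    frozenset({"BerEP4"}),
    frozenset({"CD45", "CD10"}),  # CD10-positive leukocyte
    frozenset({"CD10"}),
    frozenset({"P1H12"}),
    frozenset(),                  # no marker: fibroblast-like cell
]
result = sequential_capture(sample)
print({label: len(found) for label, found in result.items()})
```

In this model the CD10-positive leukocyte is removed at the CD45 step and therefore does not contaminate the CD10 fraction, which mirrors the reason given in the text for depleting leukocytes first.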
In  addition,  we  also  included  a  fibroadenoma  and  a  phyllodes 
tumor  in  our  SAGE  analyses.  Fibroadenomas  are  the  most  com¬ 
mon  benign  breast  tumors  that  are  not  considered  to  progress 
to  malignancy  despite  genetic  changes  detected  in  the  stromal 
(but  not  epithelial)  cells  (Amiel  et  al.,  2003).  Phyllodes  tumors, 
on  the  other  hand,  are  rare  fibroepithelial  tumors  that  are  usually 
benign  but  can  recur  and  progress  to  malignant  sarcomas.  Ini¬ 
tially,  phyllodes  tumors  were  considered  stromal  neoplasms,  but 
recent  molecular  studies  demonstrating  (frequently  discordant) 
genetic  alterations  in  both  epithelial  and  stromal  cells  suggests 
that  phyllodes  tumors  may  represent  a  true  clonal  co-evolution 
of  malignant  epithelial  and  stromal  cells  (Sawyer  et  al.,  2000, 
2002).  A  detailed  description  of  the  tissue  samples  and  the 
SAGE  libraries  is  included  in  the  Supplemental  Data  online. 
Analysis  of  the  SAGE  data  confirmed  that  the  cell  purification 
procedure  worked  well,  since  several  genes  known  to  be  spe¬ 
cific  for  a  particular  cell  type  were  present  in  the  appropriate 
SAGE  libraries.  For  example,  cytokeratins  8  and  19,  E-cadherin, 
HIN-1 ,  and  CD24  were  highly  specific  for  epithelial  cells  (HIN-1 
only  for  normal  epithelial  cells);  myofibroblast  and  myoepithelial 
cells  demonstrated  high  levels  of  smooth  muscle  actin,  various 
extracellular  matrix  proteins  including  collagens,  and  matrix 
metalloproteinases;  and  leukocyte  libraries  had  the  highest  lev¬ 
els  of  several  chemokines  and  lysozyme  (Table  1  and  Supple¬ 
mental  Table  SI).  In  general,  SAGE  libraries  prepared  from  the 
same  cell  type  purified  from  different  tissue  samples  were  highly 
similar  to  each  other,  although  there  were  differences  as  well, 
likely  due  to  variability  among  patients  and  also  slight  variability 
Figure 1. Isolation and characterization of each cell type comprising normal and cancerous breast tissue

A: Schematic outline of tissue fractionation and sequential purification of the various cell types from normal breast tissue and in situ and invasive breast
carcinomas. The procedure is described in detail in the online Supplemental Data.

B: RT-PCR analysis of each cell fraction isolated from DCIS-7 using known cell type-specific genes to confirm the purity of the cells and integrity of the
mRNA. MME (CD10) is highly specifically expressed in CD10+ myoepithelial cells and myofibroblasts, PTPRC (CD45) in leukocytes, and CDH5 (endothelial
cadherin) in endothelial cells. Although ERBB2 is not an absolutely epithelial cell-specific gene, its abundance is highest in luminal epithelial cells. PCR was
performed at 25, 30, and 35 cycles. Genes expressed at equal levels in all cell types, β-actin (ACTB) and ribosomal protein L19 (RPL19), were used as
controls.

C: Heat map depicting the relatedness of the different SAGE libraries based on 417 cell type-specific tags. Color scheme: blue, downregulated (low tag
counts); green, mean tag counts; yellow, upregulated (high tag counts). The names of the SAGE libraries prepared from epithelial cells are in red,
myoepithelial cells and myofibroblasts in green, stroma in yellow, leukocytes in blue, endothelial cells in pink, and fibroadenoma and phyllodes tumor
(stroma fraction) in purple. A detailed description of the SAGE libraries and tissue samples is included in Supplemental Data.

D: Heat map depicting the relatedness of the different SAGE libraries based on the 63 most highly cell type-specific tags. Color scheme and SAGE library
names are described as above.


in  the  purification  procedure  itself  (see  Supplemental  Data  for 
more  details). 

Comprehensive  gene  expression  profile 
of  each  cell  type 

Based on statistical methods developed for the analysis of
SAGE data (see Experimental Procedures, Supplemental Data;
Cai et al., 2004), we identified genes that are specifically
expressed in a particular cell type and tumor progression stage
(Tables 1 and 2 and Supplemental Tables S1-S15). Genes were
defined as specific for a particular cell type if the average tag
number in all the SAGE libraries generated from the selected
cell type was statistically significantly (p < 0.02) different from
all other cell types. For the purpose of these comparisons, we
considered myoepithelial cells and myofibroblasts as one group
due to their high degree of similarity, although there are genes
that are specific for myoepithelial cells and myofibroblasts,
respectively (Ronnov-Jessen et al., 1996). Using these criteria,
Table 1. List of 63 most highly cell type-specific tags and corresponding genes

SAGE tag numbers in the various libraries are indicated. Coloring reflects tag abundance in the different cell types.


Table 2. List of genes encoding secreted proteins and receptors overexpressed in DCIS myoepithelial cells compared to normal myoepithelium

SAGE tag      UniGene   Gene description
ACCAAAAACC    172928    COL1A1 collagen, type I, alpha 1
GATCAGGCCA    443625    COL3A1 collagen, type III, alpha 1
TGGAAATGAC    172928    COL1A1 collagen, type I, alpha 1
CGGGGTGGCC    1584      COMP cartilage oligomeric matrix protein
CTAACGGGGC    513022    ISLR immunoglobulin superfamily containing leucine-rich repeat
CAGATAAGTT    222171    KIAA0182 KIAA0182 protein
CCGGGGGAGC    172928    COL1A1 collagen, type I, alpha 1
GTCAAAATTT    458354    THBS2 thrombospondin 2
GTGCTAAGCG    420269    COL6A2 collagen, type VI, alpha 2
GACTTTGGAA    172928    COL1A1 collagen, type I, alpha 1
CGCCGACGAT    287721    G1P3 interferon, alpha-inducible protein (clone IFI-6-16)
TTGGGATGGG    296941    HFL1 H factor (complement)-like 1
CATATCATTA    435795    IGFBP7 insulin-like growth factor binding protein 7
TCCAGGAAAC    11590     CTSF cathepsin F
GGCCCCTCAC    274313    IGFBP6 insulin-like growth factor binding protein 6
ACATTCCAAG    245188    TIMP3 tissue inhibitor of metalloproteinase 3
ATAAAAAQAA    83942     CTSK cathepsin K
GACCAGCAGA    172928    COL1A1 collagen, type I, alpha 1
ACTTATTATG    156316    DCN decorin
GTGCGCTGAG    274485    HLA-C major histocompatibility complex, class I, C
TGCGCTGGCC    289019    LTBP3 latent transforming growth factor beta binding protein 3
AGGCTCCTGG    24395     CXCL14 chemokine
CTCAACCCCC    162757    LRP1 low density lipoprotein-related protein 1
CAGCGGCGGG    2420      SOD3 superoxide dismutase 3, extracellular
GGCACCTCAG    512234    IL6 interleukin 6
GCCTGTCCCT    821       BGN biglycan
ATTTCTTCAA    31386     SFRP2 secreted frizzled-related protein 2
TCGAAGAACC    445570    CD63 CD63 antigen
ACATTCTTTT    389964    GPNMB glycoprotein (transmembrane)
CXGTCAGCGT    283713    CTHRC1 collagen triple helix repeat containing 1
CAGCTGGCCA    445240    FBLN1 fibulin 1
ACTGAAAGAA    458355    C1S complement component 1, s subcomponent
TTCTGTGCTG    376414    C1R complement component 1, r subcomponent
GGATGTGAAA    283477    CD99 CD99 antigen
ACTCAGCCCG    101382    TNFAIP2 tumor necrosis factor, alpha-induced protein 2
TTOCCCTCAA    75111     PRSS11 protease, serine, 11 (IGF binding)
CTAAAAAAAA    54457     CD81 CD81 antigen (target of antiproliferative antibody 1)
GGCCACGTAG    155597    DF D component of complement
AAGAAAGGAG    202097    PCOLCE procollagen C-endopeptidase enhancer
GGAGGAATTC    418123    CTSL cathepsin L
AGCCACCGCG    355874    RABL2B RAB, member of RAS oncogene family-like 2B
TGTAAACAAT    170040    PDGFRL platelet-derived growth factor receptor-like
ACCTTGAAGT    407546    TNFAIP6 tumor necrosis factor, alpha-induced protein 6
CATAAATGCG    436042    CXCL12 chemokine (stromal cell-derived factor 1)
TTGCTGACTT    415997    COL6A1 collagen, type VI, alpha 1
ATGGCAACAG    149609    ITGA5 integrin, alpha 5
CTCTCCAAAC    384598    SERPING1 serine proteinase inhibitor, clade G, member 1
TCCCTGCACC    304682    CST3 cystatin C
GGAAATGTCA    367877    MMP2 matrix metalloproteinase 2
CAflGTTTCAT   24395     CXCL14 chemokine
CCGTGACTCT    433622    FSTL1 follistatin-like 1


SAGE  tag  numbers  reflect  tag  numbers  normalized  to  the  SAGE  library  with  the  highest  tag  number.  Ratio  was  calculated  as  a  ratio  of  the  average  tag 
numbers  in  the  two  DCIS  myoepithelial  libraries  divided  by  the  tag  numbers  in  the  normal  myoepithelial  library.  Genes  highlighted  in  red  were  selected  for 
follow-up  studies. 
we identified 357 tags that differentiate epithelial cells from
other cell types, 572 tags specifying myoepithelial cells and
myofibroblasts, 502 tags discriminating leukocytes, 604 tags
selecting stroma, and 124 tags discerning endothelial cells from
other cells. To further define SAGE tags specific for each cell
type, within each group of tags we selected the ones that were
not only statistically significantly different, but also more abundant
in the specific cell type. This led to the identification of 70
tags that were most abundant in epithelial cells, 117 tags present
at highest levels in myoepithelial cells and myofibroblasts, 70
tags highly expressed in leukocytes, and 117 stroma- and 78
endothelium-specific tags (Supplemental Tables S3, S5, S7,
S9, and S11). Several of these genes have previously been
described as being specific for a particular cell type, such as
keratins 8 and 19 for epithelial cells, keratins 14 and 17 for
myoepithelial cells, and chemokines and chemokine receptors
for leukocytes (Page et al., 1999), but the cell type-specific
expression of the majority of the genes has not been documented.
The majority of the transcripts corresponding to these
cell type-specific SAGE tags encode known genes, but a significant
fraction are uncharacterized ESTs or currently have no
cDNA match (~10% of the tags on average belong to each of
these last two groups). The only exceptions were tags most
abundant in stroma, since in this group 25/117 tags (21%) had
no database match, suggesting that they correspond to previously
unidentified transcripts.
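
The two-step tag selection described above can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' implementation: the p values are assumed to come from the SAGE-specific statistical test described in the Supplemental Data (Cai et al., 2004), which is not reimplemented here, and the names TagStat, differential_tags, and abundant_specific_tags, as well as the example values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TagStat:
    tag: str               # SAGE tag identifier (hypothetical labels below)
    mean_in_type: float    # average tag count in the cell type of interest
    mean_in_others: float  # average tag count in all other cell types
    p_value: float         # assumed output of the SAGE-specific test


def differential_tags(stats, p_cutoff=0.02):
    """Step 1: keep tags that differ significantly from all other cell types."""
    return [s for s in stats if s.p_value < p_cutoff]


def abundant_specific_tags(stats, p_cutoff=0.02):
    """Step 2: of those, keep tags that are also more abundant in the cell type."""
    return [s for s in differential_tags(stats, p_cutoff)
            if s.mean_in_type > s.mean_in_others]


# Made-up example values for illustration only.
example = [
    TagStat("TAG_A", 50.0, 2.0, 0.001),   # significant and more abundant
    TagStat("TAG_B", 1.0, 40.0, 0.005),   # significant but less abundant
    TagStat("TAG_C", 10.0, 9.0, 0.40),    # not significant
]

print([s.tag for s in differential_tags(example)])
print([s.tag for s in abundant_specific_tags(example)])
```

Separating the two steps mirrors the text: the first filter yields the tags that discriminate a cell type in either direction, and the second narrows them to tags enriched in that cell type.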

Next, using the SAGE tags most abundant in (417 tags) or
most highly specific for (63 tags) each of the five cell types, we
performed clustering analysis of all 27 SAGE libraries using a
new Poisson model-based K-means algorithm (PK algorithm,
Supplemental Data; Cai et al., 2004) to delineate similarities and
differences among the samples (Figures 1C and 1D). In addition,
we also performed clustering analysis of the SAGE libraries
using each of the cell type-specific gene sets (Supplemental
Figures S1 and S2). The PK clustering method orders the samples
according to their relatedness. For example, using the 63
most highly cell type-specific SAGE tags, we obtained a division
of the 27 SAGE libraries according to cell types, and within each
cell type subgroup, the DCIS samples were located between
normal breast tissue and invasive breast cancer SAGE libraries
(Figure 1D). This result indicates that not only tumor epithelial
cells, but also other cell types in the tumor, are different from
their corresponding normal counterparts. Since these differences
are already pronounced at a pre-invasive (DCIS) tumor
stage, they suggest a role for stromal changes not only in tumor
invasion and metastasis, but also in the earlier steps of breast
tumorigenesis.
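
To give a feel for what a Poisson model-based K-means procedure does, the following is a simplified sketch and should not be read as the PK algorithm itself, whose details are given in the Supplemental Data and in Cai et al. (2004). It assumes that each count vector is Poisson distributed around a cluster profile scaled to the vector's own total, assigns each vector to the cluster with the highest Poisson log-likelihood, and re-estimates the profiles until the assignments stop changing. The function names, the pseudocount, and the toy counts are all assumptions made for illustration.

```python
import math


def poisson_loglik(counts, profile):
    """Poisson log-likelihood of a count vector under a cluster profile.

    The expected count in each position is the vector's own total times the
    profile fraction; the log(y!) term is dropped because it does not depend
    on the cluster.
    """
    total = sum(counts)
    ll = 0.0
    for y, frac in zip(counts, profile):
        mu = total * frac
        if y > 0:
            ll += y * math.log(mu)
        ll -= mu
    return ll


def cluster_profile(members, width, pseudo=0.5):
    """Fraction of a cluster's counts in each position (with a pseudocount)."""
    sums = [pseudo] * width
    for counts in members:
        for j, y in enumerate(counts):
            sums[j] += y
    grand = sum(sums)
    return [s / grand for s in sums]


def poisson_kmeans(data, init_labels, n_iter=20):
    """Alternate profile estimation and likelihood-based reassignment."""
    k = max(init_labels) + 1
    width = len(data[0])
    labels = list(init_labels)
    for _ in range(n_iter):
        profiles = [
            cluster_profile([d for d, lab in zip(data, labels) if lab == c], width)
            for c in range(k)
        ]
        new_labels = [
            max(range(k), key=lambda c: poisson_loglik(d, profiles[c]))
            for d in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Toy count vectors (made up): rows 0, 1, and 4 share one pattern, rows 2 and 3 another.
counts = [
    [30, 28, 1, 0],
    [60, 55, 2, 1],
    [0, 1, 25, 30],
    [2, 0, 50, 61],
    [15, 14, 0, 1],
]
start = [0, 1, 0, 1, 0]  # deliberately poor starting assignment
labels = poisson_kmeans(counts, start)
print(labels)
```

Using a likelihood rather than Euclidean distance means that vectors with the same relative pattern but very different totals, such as the first two rows, are still grouped together, which suits count data from libraries of different depth.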

Based on our SAGE data, we found that the most consistent
and dramatic gene expression changes occur in myoepithelial
cells. More than 300 genes were differentially expressed at p <
0.002 in both DCIS myoepithelial libraries, and interestingly, a
significant fraction of these genes (89 out of 245) encode secreted
or cell surface proteins, suggesting extensive abnormal
paracrine interactions between myoepithelial and other cell
types (Supplemental Table S5). Myoepithelial cells are thought
to be derived from bipotential stem cells that also give rise to
luminal epithelial cells, although recently another progenitor has
been identified that can differentiate only into myoepithelial cells
(Bocker et al., 2002; Dontu et al., 2003). The function of myoepithelial
cells and their role in breast cancer are not well understood,
but myoepithelial cells have been shown to be able to
suppress breast cancer cell growth, invasion, and angiogenesis
(Deugnier et al., 2002; Sternlicht and Barsky, 1997). The main
distinguishing feature of in situ and invasive carcinomas, which
is also used as a diagnostic criterion, is that in DCIS, the cancer
epithelial cells are separated from the stroma by a nearly continuous
layer of myoepithelial cells and basement membrane, while
in invasive and metastatic tumors, cancer cells are admixed
with stroma. Because our SAGE and previously published data
suggest a role for these cells in breast tumor progression,
we focused our follow-up studies on myoepithelial cells with
special emphasis on secreted proteins and receptors abnormally
expressed in these cells. Several proteases (cathepsins
F, K, and L, MMP2, and PRSS11), protease inhibitors (thrombospondin
2, SERPING1, cystatin C, and TIMP3), and many different
collagens were highly upregulated in DCIS myoepithelial
cells, suggesting a role for these cells in extracellular matrix
remodeling (Table 2).
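
The normalization and ratio described in the footnote to Table 2 can be written out as a short sketch. It assumes that "normalized to the SAGE library with the highest tag number" means scaling each library's counts to the total of the largest library, and that the ratio is the mean of the two DCIS myoepithelial values divided by the normal value. The library keys, totals, and counts below are made up, and the handling of a zero count in the normal library (treating it as 1 and flagging the result as a lower bound) is an assumption, not something stated in the text.

```python
def normalize_counts(raw_counts, library_totals):
    """Scale each library's raw tag count to the size of the largest library."""
    largest = max(library_totals.values())
    return {lib: raw_counts[lib] * largest / library_totals[lib]
            for lib in raw_counts}


def dcis_to_normal_ratio(norm_counts, dcis_libs, normal_lib):
    """Mean normalized count in the DCIS libraries over the normal library.

    Returns (ratio, is_lower_bound). If the tag is absent from the normal
    library, the normal count is treated as 1 (an assumption), so the value
    returned is only a lower bound on the ratio.
    """
    dcis_mean = sum(norm_counts[lib] for lib in dcis_libs) / len(dcis_libs)
    normal = norm_counts[normal_lib]
    if normal == 0:
        return dcis_mean, True
    return dcis_mean / normal, False


# Hypothetical library sizes and raw counts for a single tag.
totals = {"N-MYOEP": 50000, "D-MYOEP-A": 100000, "D-MYOEP-B": 80000}
raw = {"N-MYOEP": 2, "D-MYOEP-A": 60, "D-MYOEP-B": 40}

norm = normalize_counts(raw, totals)
ratio, is_lower_bound = dcis_to_normal_ratio(
    norm, ["D-MYOEP-A", "D-MYOEP-B"], "N-MYOEP")
print(norm, ratio, is_lower_bound)
```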

Analysis  of  the  genotype  of  epithelial,  myoepithelial, 
and  stromal  cells 

To determine if the dramatic gene expression changes observed
in tumor myoepithelial and stromal cell types could be due to
underlying genetic alterations, we first performed aCGH analysis
of epithelial and myoepithelial cells and of myofibroblasts from
two DCIS (DCIS-6 and -7) and one invasive breast carcinoma
(IDC-7) used for SAGE. As expected, we detected numerous
chromosomal gains and losses in the tumor epithelial cells,
while no changes were detected in myoepithelial cells and
myofibroblasts (Figures 2A and 2B). Similarly, no genetic changes
were detected in epithelial and myoepithelial cells isolated from
normal tissue adjacent to the tumors (Figure 2A). These data
suggest that although nonepithelial cells in breast tumors are
phenotypically distinct from their normal counterparts, genetic
changes detectable by aCGH appear to be limited to cancer
epithelial cells. However, since array CGH is thought to be more
sensitive for the detection of copy number gains than losses
and previous studies demonstrated LOH in stromal cells, we
applied another technology, SNP arrays, for the analysis of
isolated epithelial and stromal cells from a set of breast tumors.
As expected, cancer epithelial cells from all but one invasive
breast tumor demonstrated numerous LOH events on nearly all
chromosomes, while myofibroblasts and other stromal cells from
the same tumors appeared to be mostly normal (Figure 2C
and Supplemental Figure S3). Clustering analysis based on the
inferred LOH data clearly divided the samples into two major
groups, with the tumor epithelial and stromal cells from different
cases demonstrating more similarity to each other than to their
corresponding other cell type (Figure 2C). The only exception
was epithelial cells from IDC10 (a low-grade estrogen receptor-positive
tumor) that did not appear to have major genetic
changes (the purity of the tumor epithelial cells was confirmed
by RT-PCR, data not shown), while in the phyllodes tumor, the
stroma had numerous genetic alterations with far fewer LOH
events detected in the epithelial cells. We did not detect significant
LOH in the three fibroadenomas analyzed or in the one
LCIS (lobular carcinoma in situ) case. Two nonepithelial samples
(I-MYOFIB-8 and I-STR-13) had a few areas where 2-5 adjacent
SNPs exhibited LOH (Figure 2C), but careful examination of
these SNPs individually suggested that these LOH calls are
likely due to poor hybridization results. In order to resolve this
issue, we amplified and sequenced eight of these ambiguous
Figure 2. Genotype analysis of fractionated normal and tumor breast tissue

A: Array CGH analysis of luminal epithelial cells (red line) and myofibroblasts (green line) isolated from IDC-7 invasive breast tumor used for SAGE and from
adjacent normal tissue. Mode-centered segmented data are depicted; significant gains and losses are defined as a log2 signal ratio of greater than or equal to +0.13
or less than or equal to -0.13, respectively.

B: Array CGH analysis of luminal epithelial (red line) and myoepithelial (green line) cells from DCIS-6 and DCIS-7. Areas with statistically significant gains in
the epithelial cells (chromosome 17 in the case of DCIS-6 and chromosome 20 for DCIS-7) are depicted, indicating that myoepithelial cells do not share
these changes with the epithelial cells. No significant gains and losses were detected in any other areas of the genome in the myoepithelial cells (data
not shown).

C: SNP array analysis of purified epithelial and stromal cells from invasive breast carcinomas, phyllodes tumor, fibroadenomas, and LCIS. Samples are
clustered based on inferred loss of heterozygosity (LOH). All but one tumor epithelial DNA sample are clustered together to the left, while all stromal samples,
regardless of their origin, are clustered together to the right. Inferred LOH is indicated in blue, yellow indicates regions retaining
heterozygosity, and white regions are indeterminate (noninformative). The names of DNA samples obtained from epithelial cells are depicted in red,
myofibroblasts in green, stroma in yellow, leukocytes in blue, endothelial cells in pink, fibroadenoma in purple, and LCIS in black. A detailed description of
the samples is included in the online Supplemental Data.

D: Sequence analysis of two ambiguous SNP calls present in I-MYOFIB-8 and in several controls. For all cases, the chromatograms of the sequence reads
and the SNP array calls are indicated. One of the SNPs (rs952018) is on chromosome 13q33.2, while the other one (rs1019215) is on chromosome 11p14.3.
As depicted in the figure, in the case of SNP rs952018, the I-MYOFIB-8 sample had both "A" and "G" peaks just like the N-EPI-17 sample, proving the retention
of both alleles, while the I-LEU-14 sample was homozygous for the "G" allele and the N-for IDC11 (normal DNA corresponding to tumor IDC11) was
homozygous for the "A" allele. Similarly, in the case of SNP rs1019215, the I-MYOFIB-8 sample had both "T" and "C" peaks just like the N-for IDC15 (normal DNA
corresponding to tumor IDC15), while the N-EPI-17 and I-EPI-15 were both homozygous for the "C" allele.


SNPs from these two stromal samples (I-MYOFIB-8 and I-STR-13)
together with several controls, where the SNP results clearly
depicted heterozygous or homozygous alleles. In all seven
cases in which high-quality sequencing results were obtained,
we found no evidence of LOH in either of these two ambiguous
stromal samples (Figure 2D).
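
The aCGH gain and loss criterion given in the legend of Figure 2A (a log2 signal ratio of at least +0.13 for a gain and at most -0.13 for a loss) can be expressed as a small sketch. Only the two cutoffs come from the text; the function names and the segment values are hypothetical, and the mode centering and segmentation steps that precede this call are not shown.

```python
GAIN_CUTOFF = 0.13    # log2 signal ratio at or above this is called a gain
LOSS_CUTOFF = -0.13   # log2 signal ratio at or below this is called a loss


def call_segment(log2_ratio):
    """Classify one mode-centered, segmented log2 ratio."""
    if log2_ratio >= GAIN_CUTOFF:
        return "gain"
    if log2_ratio <= LOSS_CUTOFF:
        return "loss"
    return "no change"


def call_segments(log2_ratios):
    """Classify every segment of a sample."""
    return [call_segment(r) for r in log2_ratios]


# Made-up segment values for an epithelial and a stromal sample.
epithelial = [0.45, 0.02, -0.30, 0.13]
stromal = [0.03, -0.05, 0.10, -0.12]

print(call_segments(epithelial))
print(call_segments(stromal))
```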


Evaluation  of  gene  expression 
by  immunohistochemistry  and  mRNA 
in  situ  hybridization 

The generation of the SAGE libraries involved the in vitro purification
of the cells, which could potentially alter the in vivo gene
expression patterns, although prior SAGE data from several
laboratories suggest that these changes are likely to be minimal
(Porter et al., 2003a, 2003b; St Croix et al., 2000). However, in
order to further investigate the expression of selected genes at
the cellular level in vivo, we performed immunohistochemical
analyses and mRNA in situ hybridization in a panel of DCIS and
invasive breast tumors (including tumors used for SAGE as well
as additional cases). In addition, the cell type specificity of some
genes was verified by RT-PCR in the samples used for SAGE
(data not shown). Immunohistochemical analysis confirmed that
two genes, IL-1β and CCL3 (MIP1α), are highly expressed in
leukocytes infiltrating DCIS, but not normal breast tissue,
whereas the PTPRC (CD45) panleukocyte marker was expressed
in both cases (Figure 3A). Despite the similar number
of total leukocytes in invasive tumors, the frequency of IL-1β-
and CCL3-positive leukocytes was much lower than in DCIS,
suggesting that in situ and invasive breast carcinomas may be
immunologically dissimilar. mRNA in situ hybridization determined
that in DCIS tumors, the expression of PDGF receptor
β-like (PDGFRBL), cathepsin K (CTSK), and CXCL12 was localized
to myofibroblasts as determined by smooth muscle actin
(ACTA2) staining, CXCL14 was expressed only in myoepithelial
cells, while TIMP3, cystatin C (CST3), and collagen triple helix
repeat containing 1 (CTHRC1) were expressed in both myoepithelial
cells and myofibroblasts. In invasive tumors, all seven
genes were expressed in myofibroblasts. No signal was detected
in normal breast tissue or with the sense probes (Figure
3B, Supplemental Figure S4, and data not shown). Interestingly,
although in DCIS tumors we detected CXCL14 expression only
in myoepithelial cells, in some (4/9) invasive breast carcinomas,
the expression of CXCL14 was restricted to the tumor epithelial
cells (Figures 3B and 4A). Similarly, some breast cancer cell
lines expressed high levels of CXCL12 or CXCL14 in vitro, suggesting
that during tumor progression a paracrine factor may
be converted into an autocrine one due to its upregulation in
the tumor epithelial cells (Figure 4B). Interestingly, all CXCL14-positive
invasive ductal carcinomas and even the CXCL14-expressing
breast cancer cell line (UACC812) were obtained from
young, premenopausal patients (average age of onset 39 years),
suggesting a possible association of CXCL14 expression with
hormone levels or clinico-pathologic characteristics of the tumors,
the analysis of which requires the examination of larger
tumor sets.

The  effect  of  CXCL12  and  CXCL14  chemokines 
on  breast  cancer  cells 

The high level of expression of two chemokines, CXCL12 and
CXCL14, in myoepithelial cells and myofibroblasts both in DCIS
and invasive breast carcinomas was particularly interesting due
to the known function of chemokines as regulators of cell proliferation,
differentiation, migration, and invasion (Gerard and Rollins,
2001; Muller et al., 2001; Rossi and Zlotnik, 2000). To
determine if CXCL12 and CXCL14 may act as autocrine and/or
paracrine factors in breast tumors, we investigated which
cell types appear to have receptors for these chemokines in vivo
in primary breast tissue. The signaling receptor for CXCL12
is CXCR4, which is known to be widely expressed in various
lymphoid as well as a variety of epithelial cells (Gerard and
Rollins, 2001). We confirmed the expression of CXCR4 in
lymphoid and breast epithelial cells using immunohistochemistry,
while SAGE data indicated that its expression is increased
in invasive tumors compared to DCIS and normal breast tissue
Figure 3. Validation of SAGE data using immunohistochemistry and mRNA
in situ hybridization and Northern blot analysis

A: Immunohistochemical analysis of PTPRC (CD45), IL-1β, and CCL3 expression
in normal, DCIS, and invasive cancer breast tissue. Black signal indicates
expression of the indicated proteins in leukocytes. Methyl green was used
to stain the nuclei to visualize tissue histology. Magnification is 100x.

B: mRNA in situ hybridization analysis of the indicated genes using antisense
ribo-probes in a panel of normal, DCIS, and invasive breast cancer tissue.
Red (PDGFRL, CTSK, CTHRC1, TIMP3, CST3, and CXCL12) or black (CXCL14
and IGFBP7) staining indicates the presence of the mRNA depending on
the hybridization protocol used. Paraffin sections were analyzed for ACTA2
(smooth muscle actin) expression by immunohistochemistry to confirm the
identity of myoepithelial cells and myofibroblasts. Brown staining indicates
the expression of SMA in myoepithelial cells and myofibroblasts. Magnification
is 100x. More detailed images with higher (200x) magnification are
included in Supplemental Data (Supplemental Figure S4).


(data not shown). The signaling receptor of CXCL14 is unknown,
but cell surface ligand binding experiments have suggested the
presence of a putative CXCL14 receptor on monocytes and B
cells, suggesting that its receptor is not likely to be CXCR4
(Kurth et al., 2001; Sleeman et al., 2000). To determine if a
Figure 4. CXCL14 expression in primary breast tumors and breast cancer cell lines

A: mRNA in situ hybridization using CXCL14 antisense ribo-probe in multiple DCIS and invasive breast carcinomas including the tumors used for SAGE
(DCIS-7). Black/purple staining indicates the presence of the CXCL14 mRNA, while nuclei were stained with nuclear FastRed to visualize tissue histology.
The names of the tumor samples are indicated above or below the pictures. In DCIS cases, CXCL14 is expressed only in myoepithelial cells, while in some
invasive breast carcinomas (CT22 and CT25), strong expression is observed in tumor epithelial cells.

B: Northern blot analysis of CXCL12, CXCL14, and CXCR4 expression in the indicated breast cancer cell lines, breast organoids (ORG1-10, uncultured breast
ducts from normal breast tissue), and primary breast tumor CT22. Hybridization with β-actin (ACTB) was used as a control for loading. Confirming the mRNA
in situ hybridization data, strong CXCL14 expression is detected in tumor CT22, and similarly in the SUM-229 and UACC812 breast cancer cell lines.


CXCL14 binding cell surface protein(s) is also present on breast
cancer cells, we generated an alkaline phosphatase-CXCL14
(AP-CXCL14) fusion protein to be used as a ligand in receptor
binding assays. Conditioned medium of AP-CXCL14- or control
AP-expressing cells was then used as an affinity reagent to stain
normal and cancerous mammary tissue sections including the
DCIS tumors used for SAGE. Blue staining indicated the presence
of a CXCL14 binding protein in certain leukocytes and
breast epithelial cells (Figure 5A). These results suggest the
presence of a cell surface CXCL14 binding protein(s) in cancerous
and normal mammary epithelial cells and are consistent
with a paracrine mechanism of CXCL14 action in the breast.
To test further the binding characteristics of AP-CXCL14, we
performed in vitro ligand binding assays using various cell lines.
Low-level AP-CXCL14 binding was detected in all cell lines
tested, including MDA-MB-231 and MDA-MB-435 breast cancer
and MCF10A immortalized mammary epithelial cells (data
not shown). To further characterize the interaction between
AP-CXCL14 and the putative CXCL14 receptor, we performed more detailed
binding assays in MDA-MB-231 breast cancer cells. Scatchard plot
analysis showed two binding slopes in MDA-MB-231 cells, indicating
the presence of high-affinity (Kd = 6.1 × 10⁻⁸ M) and
low-affinity (Kd = 56.7 × 10⁻⁸ M) binding sites (Figure 5B).
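
As background for how a Kd is read off a Scatchard plot, here is a minimal sketch for a single class of binding sites. For one site, bound/free = (Bmax - bound)/Kd, so a straight-line fit of bound/free against bound has slope -1/Kd and x-intercept Bmax. The two-slope analysis reported above would correspond to fitting separate lines to the steep and shallow regions of the plot, which is not shown here. The function name, the synthetic data, and the Bmax value are assumptions; the synthetic data are generated from the reported high-affinity Kd only to check that the fit recovers it.

```python
def scatchard_fit(bound, free):
    """Least-squares line through (bound, bound/free); returns (Kd, Bmax)."""
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope           # slope of a Scatchard plot is -1/Kd
    bmax = -intercept / slope   # x-intercept is Bmax
    return kd, bmax


# Synthetic one-site binding data (illustration only, not experimental values).
true_kd = 6.1e-8     # M, the high-affinity value reported in the text
true_bmax = 1.0      # arbitrary units, hypothetical
free = [1e-8, 3e-8, 1e-7, 3e-7]                       # free ligand, M
bound = [true_bmax * f / (true_kd + f) for f in free]

kd, bmax = scatchard_fit(bound, free)
print(kd, bmax)
```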

In previous studies, CXCL12 was demonstrated to enhance
breast cancer cell growth, migration, and invasion (Hall and
Korach, 2003; Muller et al., 2001). In order to determine if
CXCL14 has similar effects, we tested the effect of conditioned
medium containing AP-CXCL14 on the growth of MDA-MB-231
and MCF10A cells, while its effect on cell migration and invasion
was investigated in MDA-MB-231 cells. Conditioned media
of cells transfected with AP alone and CXCL12 were used as
negative and positive controls, respectively. Similar to CXCL12,
CXCL14 enhanced the proliferation of MDA-MB-231 and
MCF10A cells and the migration and invasion of MDA-MB-231
cells (Figures 5C and 5D, and data not shown). The concentration
of AP-CXCL14 was 2-30 nM in these experiments, which
is similar to the concentration required by several chemokines,
including CXCL12, to exert biological effects. The same results
were obtained in cell migration and invasion assays using
CXCL14-AP (C-terminal AP-tag) and CXCL14-HA (C-terminal
HA-tag) fusion proteins (Figure 5D and data not shown); thus,
the observed effects are not likely to be due to the position or
identity of the epitope tag. Preliminary results using recombinant
CXCL14 protein and CXCL14-expressing adenovirus demonstrated
possible induction of calcium flux in MDA-MB-231 and
activation of AKT in MCF10A cells, respectively (data not
shown), further suggesting that mammary epithelial cells have
a functional CXCL14 receptor.

To determine if paracrine factors, including CXCL14, secreted
by DCIS myoepithelial cells may influence the proliferation
of tumor epithelial cells in vivo, we analyzed the expression
of Ki67, a marker of cell proliferation, in the two DCIS samples
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Figure  5.  Analysis  of  CXCL14  ligand  binding  characteristics  and  function 

A: Identification of a putative CXCL14 receptor in breast epithelial cells using an AP (alkaline phosphatase)-CXCL14 fusion protein as ligand. Blue staining
reflecting AP activity indicates binding of AP-CXCL14 to breast epithelial cells and some stromal leukocytes, while no staining is detected with the AP alone
negative control. All these tumor samples were also analyzed for the expression of CXCL14 by mRNA in situ hybridization (Figures 3B and 4A) and were
expressing CXCL14 in tumor epithelial cells (CT22 and CT25) and DCIS myoepithelial cells (T18 and T25). Images were taken with 10x and 20x objectives
(100x and 200x magnification).

B: Scatchard transformation of AP-CXCL14 binding assays in MDA-MB-231 cells. Red and black colored lines indicate high (Kd = 6.1 x 10^-8 M) and low
(Kd  =  56.7  x  1O“0  M)  -affinity  binding  slopes,  respectively. 

C: The effect of CXCL12 and AP-CXCL14 on the growth of MDA-MB-231 breast cancer and MCF10A immortalized breast epithelial cells (red lines) compared
to  AP  and  control  media  (black  lines).  Representative  result  of  experiments  performed  in  triplicate. 

D:  The  effect  of  CXCL12,  CXCL14,  and  10%  fetal  bovine  serum  (FBS)  on  the  migration  and  invasion  of  MDA-MB-231  breast  cancer  cells.  The  number  of  cells 
that crossed the uncoated (migration) or Matrigel-coated membranes (invasion) is indicated on the y axis. Representative result of experiments performed
in  triplicate. 

E: Immunohistochemical analysis of Ki67 expression in DCIS-6 and DCIS-7 samples to identify proliferating cells. Images were taken with 10x and 20x
objectives (100x and 200x magnification). Ki67 is expressed in all phases of the cell cycle except in noncycling (G0) cells. Tumor epithelial cells adjacent
to  the  myoepithelial  cell  layer  are  more  frequently  positive  than  their  more  centrally  located  counterparts. 


used for SAGE (Figure 5E). In both cases, epithelial cells adjacent
to the myoepithelial cell layer were more frequently positive
for Ki67 than tumor epithelial cells in other parts of the ducts.
This result suggests that tumor epithelial cells may receive paracrine
signals from adjacent myoepithelial cells that enhance
their proliferation, although other reasons for this intraductal
location-dependent proliferation difference cannot be excluded.
Consistent with this, a recent study reported that the gene
expression profiles of tumor epithelial cells located at the periphery
and the center of DCIS ducts differ significantly (Zhu
et al., 2003).


Discussion 

Epithelial-mesenchymal interactions are known to be important
for the normal development of the mammary gland and to play
a role in breast tumorigenesis (Bissell et al., 2002; Coussens
and Werb, 2002; Kenny and Bissell, 2003; Radisky et al., 2001;
Shekhar et al., 2003; Tlsty, 2001; Tlsty and Hein, 2001; Wiseman
and Werb, 2002). Early studies demonstrated that the normal
mammary microenvironment is capable of “reverting” the neoplastic
phenotype of breast cancer cells by inducing cellular
differentiation (DeCosse et al., 1973, 1975), suggesting that
cancer cells can thrive only in a distorted environment or have
to become independent of extracellular signals. The contribution
of genetic host factors to tumor initiation, progression, and
angiogenesis also supports a role for nonepithelial cells in carcinogenesis
(Hunter, 2004; Rohan et al., 2000). This was dramatically
illustrated by the finding that inactivation of the TGF-β type II
receptor in stromal fibroblasts led to prostate and gastric epithelial
neoplasia (Bhowmick et al., 2004). Similarly, the recent finding
that mammary tumors formed only in cleared
mammary fat pads of rats treated with carcinogen, regardless of
whether the injected epithelial cells were treated with carcinogen
in vitro, also emphasizes the importance of stromal alterations
in the initiating steps of breast cancer (Maffini et al., 2004).
Numerous in vitro and in vivo studies using diverse experimental
systems have demonstrated that the growth, survival, polarity,
and invasive behavior of breast cancer cells can be modulated
by myoepithelial and various stromal cells, and several genes
have been implicated in this process
(Bissell et al., 2002; Coussens and Werb, 2002; Deugnier et al.,
2002; Elenbaas and Weinberg, 2001; Gudjonsson et al., 2002;
Kenny and Bissell, 2003; Radisky et al., 2001; Shekhar et al.,
2003; Sternlicht and Barsky, 1997; Tlsty, 2001; Tlsty and Hein,
2001; Wiseman and Werb, 2002). However, a comprehensive
molecular analysis of all cell types that compose normal human
breast tissue and breast carcinomas has not been
performed.

With the aim of delineating epithelial-stromal/myoepithelial
cell interactions at the molecular level, we determined the comprehensive
gene expression and genomic profiles of epithelial,
myoepithelial, and stromal cells in normal breast tissue and in
situ and invasive breast carcinomas. Our results confirm at the
molecular level that the cellular microenvironment is dramatically
different between normal breast tissue and breast carcinomas
and that this is already evident at the in situ carcinoma
stage. Based on our gene expression data, we determined that
the most dramatic and consistent changes occur in myoepithelial
cells and myofibroblasts and that the majority of the differentially
expressed genes encode secreted and cell surface proteins
(Tables 1 and 2 and Supplemental Tables S2 and S5). Since
previous data also implicated these two cell types in breast
tumor progression, particularly in the transition of in situ to
invasive carcinomas (Alpaugh et al., 2000; Barsky, 2003; Chauhan
et al., 2003; Nguyen et al., 2000; Shao et al., 1998; Sternlicht
and Barsky, 1997; Sternlicht et al., 1997; Walter-Yohrling et
al., 2003), we mainly focused on tumor myoepithelial cells and
myofibroblasts and the genes expressed by them.

Myoepithelial cells play a major role in the formation of the
basement membrane and lactation due to their expression of
type IV collagen, laminin, smooth muscle actin, and oxytocin
receptor (Gudjonsson et al., 2002; Murrell, 1995). They also
have been suggested to suppress breast cancer cell growth,
invasion, and angiogenesis via shedding of CD44 and expression
of protease inhibitors (Alpaugh et al., 2000; Barsky, 2003;
Xiao et al., 1999). On the other hand, myoepithelial cells are
also important for the survival, differentiation, and polarity of
normal luminal epithelial cells (Gomm et al., 1997a, 1997b).
Proteomic and mRNA expression profiling of short-term cultured
myoepithelial cells and myoepithelial cell lines, respectively,
gave a glimpse of the molecular basis for the tumor and
invasion suppressor role of normal myoepithelium (Barsky,
2003; Page et al., 1999). Our SAGE-based profiling of freshly
isolated, uncultured myoepithelial cells from normal breast tissue
also demonstrated the high expression of laminin, tenascin,
thrombospondin, and PAI-1 binding protein. However, the expression
of these genes was downregulated in DCIS myoepithelial
cells similar to that of cytokeratins 7, 14, and 17, oxytocin
receptor, and tropomyosin, suggesting that DCIS myoepithelial
cells are phenotypically altered and less differentiated than normal
myoepithelial cells. In keeping with this, several recent studies
described a lack of commonly used myoepithelial markers (including
CD10 and SMA) in a subset of morphologically distinct
myoepithelial cells, suggesting that myoepithelial cells may also
be subject to pathological alterations (Zhang et al., 2003). Moreover,
in support of a role for myoepithelial cells in breast tumor
progression, it was recently reported that DCIS tumor epithelial
cells adjacent to a disrupted myoepithelial cell layer are molecularly
and genetically different from their more distant counterparts
(Man et al., 2003).

Myofibroblasts are stromal fibroblasts with features of both
myoblasts (e.g., expression of smooth muscle actin) and fibroblasts
that have been implicated in breast cancer invasion, extracellular
matrix remodeling, wound healing, and chronic inflammation
(De Wever and Mareel, 2003; Gabbiani, 1999;
Schurch, 1999). The cell type of origin of myofibroblasts is not
conclusively established. Certain cytokines can induce (TGF-β)
or inhibit (IFN-γ) the transformation of fibroblasts into myofibroblasts
in vitro (De Wever and Mareel, 2003; Tanaka et al.,
2003), while PDGF-B stimulates the proliferation of fibrocytes
and their conversion into myofibroblasts in vivo (Oh et al., 1998).
Isolation of various stromal and epithelial cells from breast tumors
and their coculturing in vitro demonstrated that cancer
epithelial cells can induce the expression of myofibroblast markers
in a subset of fibroblasts (Ronnov-Jessen et al., 1995). However,
the finding that only a small fraction of fibroblasts were
transformed into myofibroblasts (Ronnov-Jessen et al., 1995)
raises the question of whether myofibroblasts could be derived
from specific stem cells that are normally present in the breast
or in the bone marrow and are growth stimulated or recruited
by adjacent cancer epithelial cells. Recent data both in animal
models and human breast tumors support the hypothesis that
at least a subset of cancer-associated myofibroblasts is derived
from circulating bone-marrow derived cells (Chauhan et al.,
2003; Ishii et al., 2003). Our finding that the gene expression
profiles of myofibroblasts isolated from different invasive breast
carcinomas are highly similar also suggests a common cell
type of origin.

Two genes highly expressed in tumor myoepithelial cells
and myofibroblasts, encoding the chemokines CXCL12 and CXCL14,
were particularly interesting due to the recently demonstrated
role of chemokines and chemokine receptors in cancer cell
growth, invasion, and metastasis (Barbero et al., 2002; Chen et
al., 2003; Hall and Korach, 2003; Kang et al., 2003; Muller et
al., 2001; Scotton et al., 2002). CXCL12 has been previously
implicated in breast cancer metastasis (Kang et al., 2003; Muller
et al., 2001), but its high expression in DCIS (a pre-invasive
tumor) myofibroblasts suggests that it might have additional
roles in the earlier stages of breast tumorigenesis. Consistent
with this hypothesis, CXCL12 was recently identified as a transcriptional
target of the estrogen receptor that mediates estrogen-induced
proliferation of breast cancer cells (Hall and Korach,
2003). Relatively little is known about the CXCL14
chemokine despite the fact that it was independently identified
by multiple labs using different approaches. The high expression
of CXCL14 in inflammatory cells in multiple cancer types and
its selectivity for monocytes may suggest a role in macrophage
development (Frederick et al., 2000; Hromas et al., 1999; Kurth
et al., 2001; Sleeman et al., 2000). Although the receptor for
CXCL14 has not been identified, the induction of calcium mobilization
by recombinant CXCL14 in monocytes suggests that,
similar to other chemokines, it is also likely to signal via a G
protein-coupled receptor. Our preliminary results demonstrating
intracellular calcium flux in MDA-MB-231 breast cancer cells
also support this hypothesis.

In  addition  to  phenotypic  alterations,  several  recent  studies 
described  genetic  changes  (including  LOH  and  mutations  in 
tumor  suppressor  genes)  in  stromal  cells  adjacent  to  breast 
cancer  cells  (Kurose  et  al.,  2001,  2002;  Lakhani  et  al.,  1998; 
Moinfar  et  al.,  2000;  Wernert  et  al.,  2001).  Loss  of  heterozygosity 
at  several  loci  has  also  been  demonstrated  in  normal-appearing 
epithelial  cells  adjacent  to  breast  carcinomas  and  short-term 
cultured luminal and myoepithelial cells (Deng et al., 1996; Forsti
et  al.,  2001;  Lakhani  et  al.,  1999;  Moinfar  et  al.,  2000).  In  several 
cases,  the  tumor  epithelial  and  stromal  cells  had  discordant 
genetic  changes,  suggesting  a  clonal  co-evolution  for  these  two 
cell  types.  Moreover,  due  to  the  low  probability  of  two  adjacent 
cells  simultaneously  acquiring  genetic  changes,  this  would  also 
suggest  that  breast  cancer  epithelial  and  stromal  cells  may  be 
derived  from  a  common  stem  cell  and  then  undergo  a  divergent 
genetic  selection  process. 

In order to determine whether the same populations of tumor
epithelial, myoepithelial, and stromal cells in which we detected
dramatic gene expression changes also harbor underlying
genetic alterations, we analyzed the genotype of these cell types
using different technologies in 2 DCIS and 12 invasive breast
carcinomas. All but one of the tumor epithelial cell samples had numerous
LOH events involving almost all chromosome arms. The most frequent
genetic changes we identified in the tumor epithelial cells
(1q, 8q, 17q, and 20q gain, and 6q, 8p, 10q, 12q, and 17p LOH),
both in DCIS and invasive tumors, were in agreement with those
of prior studies (Nishizaki et al., 1997; Waldman et al., 2000).
The one tumor DNA sample (IDC10) that appeared to be devoid
of  significant  LOH  was  obtained  from  a  low-grade  estrogen  and 
progesterone  receptor-positive,  HER2-negative  breast  tumor. 
The  lack  of  gross  chromosomal  changes  in  this  tumor  is  unlikely 
to  be  due  to  technical  issues,  but  potentially  reflects  a  special 
pathway of breast tumorigenesis. Consistent with this, an independent
study using BAC array CGH analysis of a large set of
breast tumors also found that a subset of breast tumors (9/146)
has minimal (<1.5% of the genome) chromosomal changes
(Dr.  J.  Gray,  Lawrence  Berkeley  National  Laboratory,  personal 
communication).  Using  three  different  methods  (aCGH,  SNP 
arrays,  and  direct  sequencing  of  specific  SNPs),  we  detected 
genetic  changes  only  in  cancer  epithelial  cells.  However,  in  a 
malignant  phyllodes  tumor  that  is  thought  to  be  composed  of 
malignant  stroma  and  epithelium,  we  detected  LOH  in  both 
components.  These  results  suggest  that  using  the  technologies 
we  applied,  genetic  changes  can  be  detected  both  in  epithelial 
and  stromal  cells,  but  only  if  there  is  a  mono-  or  oligoclonal 
proliferation  of  neoplastic  epithelial  or  stromal  cells.  Our  inability 
to find conclusive genetic alterations in stromal cells from invasive
ductal breast carcinomas is seemingly in disagreement
with the findings of several of the above referenced studies.
However, we believe that the different results
could be due to the use of different technologies and approaches.
All the studies that described LOH in cancer stroma
analyzed a few polymorphic markers and a fairly small population
of stromal cells isolated by microdissection from the same
area adjacent to tumor epithelial cells, while we analyzed all the
stromal cells from the tumor and used comprehensive genome-wide
SNP arrays. Thus, if the stromal cells are highly heterogeneous
with respect to genetic alterations, these changes can
be detected only if relatively few cells from the same area of
the tumor are analyzed. However, in our view this argues against
the hypothesis that the genetic changes in the stroma are selected
for and thus play a major role in tumorigenesis.

In  summary,  this  study  provides  a  comprehensive  molecular 
characterization  of  each  cell  type  composing  normal  breast 
tissue  and  in  situ  and  invasive  breast  carcinomas.  The  genes 
described  here  should  therefore  provide  a  valuable  resource  for 
future basic and clinical studies addressing the role of epithelial-stromal
cell interactions in breast and other cancer types. The
availability of specific chemokine receptor inhibitors and preclinical
studies demonstrating dramatic tumor and metastasis suppressive
effects using CXCR4 inhibitors in brain and breast
tumors (Rubin et al., 2003; Tamamura et al., 2003) provide a
proof of principle that therapeutic targeting of chemokines is a
promising new opportunity for the treatment of breast carcinomas.

Experimental  procedures 
Cell  lines  and  tissue  specimens 

Breast  cancer  cell  lines  were  obtained  from  American  Type  Culture  Collection 
(Manassas,  VA)  or  were  generously  provided  by  Drs.  Steve  Ethier  (University 
of  Michigan)  and  Arthur  Pardee  (Dana-Farber  Cancer  Institute).  Cells  were 
grown  in  media  recommended  by  the  provider.  Tumor  specimens  were 
obtained  from  Brigham  and  Women’s  and  Massachusetts  General  Hospitals 
(Boston,  MA),  Duke  University  (Durham,  NC),  University  Hospital  Zagreb 
(Zagreb,  Croatia),  and  the  National  Disease  Research  Interchange,  snap 
frozen  on  dry  ice,  and  stored  at  -80°C  until  use  or  were  processed  for 
purification as described below. All human tissue was collected using protocols
approved by the Institutional Review Boards. We purified all the cell
types from 2 different normal reduction mammoplasty tissues, 2 different
DCIS, and 13 different invasive ductal carcinomas. Due to technical difficulties
(insufficient number of cells), we were not able to generate SAGE libraries
from each cell type of each tissue used for purification. In addition, selected
cell types were isolated from a few additional normal and DCIS samples.
The detailed protocol used for the purification of all cell types is included
in the Supplemental Data. The estimated number of cells obtained from
each fraction varied from 10,000 to 100,000.

Generation  and  analysis  of  SAGE  libraries,  mRNA  in  situ 
hybridization,  and  immunohistochemistry 

All  SAGE  libraries  were  generated  using  a  modified  micro-SAGE  protocol 
and the I-SAGE (libraries prepared in 2002) or long I-SAGE (I-epi-7, I-epi-8,
I-epi-9, I-leu-7, I-str-7, I-myofib-7, I-myofib-8, I-myofib-9, D-str-6, FA, PHY)
kits  from  Invitrogen.  The  samples  were  collected  and  SAGE  libraries  were 
generated  during  2002-2004,  and  the  long-SAGE  kit  became  available  only 
in  2003.  SAGE  libraries  were  sequenced  by  Agencourt  (Beverly,  MA)  as 
part  of  the  NCI-CGAP  SAGE  project,  and  all  data  will  be  deposited  to  the 
SAGEGenie  website  (http://cgap.nci.nih.gov/SAGE).  Approximately  50,000 
tags  (average  tag  number  56,647  ±  4,383)  were  obtained  from  each  library, 
and  the  preliminary  analysis  of  the  SAGE  data  was  performed  essentially 
as  described  (Porter  et  al.,  2001).  Briefly,  genes  significantly  (p  <  0.002) 
differentially  expressed  between  normal  and  cancerous  cells  were  identified 
by  performing  pair-wise  comparisons  using  the  SAGE2000  software  and 
Monte Carlo analysis. Significance calculation among groups of SAGE libraries
and clustering analyses were performed using a new Poisson model-based
K-means algorithm (PK algorithm, Cai et al., 2004). A detailed description
of the methodology used for the analysis and clustering of the SAGE
data  is  provided  in  the  Supplemental  Data.  Probes  for  the  selected  genes 
to be used for mRNA in situ hybridization were generated by PCR amplification
of a 300-500 bp region of the 3'UTR and subcloning the fragments into
pZERO1.0 (Invitrogen). The identity of the subcloned PCR products was
confirmed by sequencing, and the resulting plasmids were used for the
generation of digoxigenin-labeled riboprobes essentially as described (Porter
et al., 2003a). mRNA in situ hybridizations and immunohistochemistry were
performed as described or as recommended by the antibody supplier (Porter
et al., 2003a). Mouse monoclonal antibodies for IL1β and CCL3 were purchased
from R&D, while anti-CD45 and anti-Ki67 mouse monoclonal antibodies
were obtained from DAKO.
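
The pair-wise comparison of tag counts between two SAGE libraries described above can be illustrated with a small simulation. The sketch below is not the SAGE2000 implementation; it is a generic Monte Carlo test, assuming that under the null hypothesis each occurrence of a tag falls into either library in proportion to library size. The function names, the normalization scale of 50,000 tags, and the example counts are illustrative assumptions.

```python
import random


def tags_per_library(count, total, scale=50_000):
    """Normalize a raw tag count to a common library size."""
    return count * scale / total


def monte_carlo_p(count_a, total_a, count_b, total_b, n_sim=10_000, seed=0):
    """Estimate how often a difference in tag frequency at least as large
    as the observed one arises by chance between two libraries."""
    rng = random.Random(seed)
    n_tag = count_a + count_b
    if n_tag == 0:
        return 1.0
    # Null model: each tag occurrence lands in library A with a
    # probability equal to A's share of all sequenced tags.
    p_a = total_a / (total_a + total_b)
    observed = abs(count_a / total_a - count_b / total_b)
    hits = 0
    for _ in range(n_sim):
        sim_a = sum(rng.random() < p_a for _ in range(n_tag))
        sim_b = n_tag - sim_a
        if abs(sim_a / total_a - sim_b / total_b) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_sim + 1)


# Hypothetical tag seen 40 times in one library and 5 times in another,
# each library containing 50,000 tags.
p_value = monte_carlo_p(40, 50_000, 5, 50_000)
is_significant = p_value < 0.002   # threshold quoted in the text
```

With 10,000 simulations the smallest reportable estimate is about 0.0001, so the number of simulations has to be chosen with the significance threshold in mind.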

Array  comparative  genomic  hybridization 

cDNA array comparative genomic hybridization using Agilent (Palo Alto,
California) arrays was performed by the Belfer Genome Center at the Dana-Farber
Cancer Institute. Genomic DNA was digested with DpnII and random
prime labeled according to standard protocols with slight modifications
(Pollack et al., 1999). (For a detailed protocol, see
http://genomic.dfci.harvard.edu/array_cgh.htm.) Labeled DNAs were hybridized to human cDNA
microarrays containing 12,814 unique cDNA clones (Agilent Technologies,
Human 1 clone set). Among these clones, approximately 9,420 unique map
positions were found for 12,020 unique GenBank sequences. The median
interval between cDNAs is 100.1 kilobases, 92.8% of intervals are less than
1 megabase, and 98.6% are less than 3 megabases. The density of coverage
is closely correlated with gene density. Following extensive QA analysis,
fluorescence ratios of scanned images of the arrays were calculated and
the raw array CGH profiles were processed to identify statistically significant
transitions in copy number using a segmentation algorithm that employs
permutation to determine the significance of change points in the raw data.
By mode centering this segmented data set, we defined gains and losses
as log2 signal ratios of at least +0.13 or at most -0.13, respectively,
and amplification and deletion as ratios greater than 0.52 or less than -0.58,
respectively (e.g., 97% or 3% quantiles). Statistical analysis of the aCGH data
will be described in detail elsewhere (Brennan et al., 2004). Segmentation
of aCGH profiles was performed by a changepoint identification algorithm
provided by Adam Olshen and E.S. Venkatraman (Lucito et al., 2003).
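
The copy number thresholds given above can be written out as a short classification sketch. The cutoffs are those quoted in the text; everything else is an assumption made for illustration, including the function names, the example segments, and the way the mode is estimated (segment means are rounded to two decimals and weighted by probe count). The segmentation step itself is not reproduced here.

```python
from collections import Counter


def mode_center(segments, ndigits=2):
    """Shift segment means so that the most common value becomes zero.

    `segments` is a list of (mean_log2_ratio, n_probes) tuples. Estimating
    the mode from rounded, probe-weighted means is an assumption.
    """
    weights = Counter()
    for mean, n_probes in segments:
        weights[round(mean, ndigits)] += n_probes
    mode = max(weights, key=weights.get)
    return [(mean - mode, n_probes) for mean, n_probes in segments]


def classify_segment(log2_ratio):
    """Apply the log2 ratio cutoffs quoted in the text."""
    if log2_ratio > 0.52:
        return "amplification"
    if log2_ratio < -0.58:
        return "deletion"
    if log2_ratio >= 0.13:
        return "gain"
    if log2_ratio <= -0.13:
        return "loss"
    return "neutral"


# Hypothetical segmented profile: (mean log2 ratio, number of probes).
segments = [(0.05, 900), (0.65, 40), (-0.20, 60)]
calls = [classify_segment(mean) for mean, _ in mode_center(segments)]
```

Checking the stronger thresholds first means that an amplified segment is not also reported as a simple gain.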

Single  nucleotide  polymorphism  array  analysis 

SNP  array  hybridizations  were  performed  by  the  Dana-Farber  Microarray 
Core using Affymetrix 11K XbaI SNP arrays and protocols recommended
by Affymetrix (Santa Clara, CA). These arrays contain probes for both alleles
at 11,565 SNP loci, with mean and median intermarker distances of 209 kb
and 104 kb, respectively, with probe density closely correlating with gene
density  (Matsuzaki  et  al.,  2004).  Arrays  were  scanned  using  a  confocal  laser 
scanner  (Agilent,  Palo  Alto,  California),  and  Affymetrix  genotyping  software 
(Affymetrix  GeneChip  5.0)  was  used  to  make  allele  calls  for  all  loci.  These 
data  were  then  analyzed  using  dChipSNP  (Lieberfarb  et  al.,  2003;  Lin  et  al., 
2004). Loci that were heterozygous (AB call) in normal epithelium or leukocytes
but homozygous (AA or BB call) in the test tissue were identified as
potentially  having  undergone  LOH.  Some  of  these  potential  LOH  events 
reflect  genotyping  error,  but  serial  events  among  neighboring  loci  along  a 
chromosome  likely  reflect  true  regional  LOH.  Regions  of  statistically  likely 
LOH  were  delineated  according  to  a  Hidden  Markov  Model  analysis;  the 
detailed  method  will  be  described  elsewhere.  Two  samples  (LCIS  and  FA) 
had  no  normal  reference  counterparts.  For  these,  regions  of  LOH  were 
inferred  when  a  stretch  of  consecutive  homozygous  loci  exceeded  what 
would  be  expected  by  chance  alone.  Again,  a  Hidden  Markov  Model  analysis 
was  used,  assigning  a  marginal  probability  of  heterozygosity  of  0.37  to 
correspond  to  the  actual  rate  found  in  these  SNPs  (Matsuzaki  et  al.,  2004) 
and  a  transition  probability  between  consecutive  SNPs  proportional  to  the 
genetic distance between them (Lander and Green, 1987). A detailed description
of the method will be presented in a forthcoming study (M. Lin,
R.B.,  X.  Zhao,  M.  Meyerson,  C.  Li,  and  W.R.S.,  unpublished  data).  Samples 
were  clustered  hierarchically  as  previously  described  (Lin  et  al.,  2004),  based 
upon  LOH  calls  in  all  statistically  likely  regions  of  LOH  in  all  chromosomes. 
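
The logic of the candidate LOH calls described above can be sketched in a few lines. This is a simplified illustration and not the Hidden Markov Model used in the analysis: it flags informative loci (AB in the normal reference) that are homozygous in the test tissue, measures the longest run of such loci, and, for samples without a reference, computes how likely a run of homozygous calls would be if SNPs were independent with the 0.37 heterozygosity rate quoted in the text. The independence assumption, the function names, and the example genotypes are all hypothetical.

```python
HET_RATE = 0.37  # marginal probability of heterozygosity quoted in the text


def informative_loh_flags(normal_calls, test_calls):
    """For loci that are AB in the normal sample, flag those that are
    homozygous (AA or BB) in the test sample. Other loci are skipped."""
    flags = []
    for normal, test in zip(normal_calls, test_calls):
        if normal != "AB" or test not in ("AA", "BB", "AB"):
            continue  # uninformative locus or missing call
        flags.append(test != "AB")
    return flags


def longest_run(flags):
    """Length of the longest stretch of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def chance_of_homozygous_run(length, het_rate=HET_RATE):
    """Probability that `length` independent SNPs are all homozygous.

    Treating neighboring SNPs as independent ignores linkage, which is an
    assumption of this sketch."""
    return (1.0 - het_rate) ** length


# Hypothetical genotype calls along one chromosome arm.
normal = ["AB", "AA", "AB", "AB", "BB", "AB"]
test = ["AA", "AA", "BB", "AA", "BB", "AB"]
flags = informative_loh_flags(normal, test)
run_length = longest_run(flags)
p_ten = chance_of_homozygous_run(10)
```

Under these assumptions a single homozygous call carries little weight, whereas ten consecutive homozygous loci would arise by chance in roughly 1% of cases, which is why serial events along a chromosome are more convincing than isolated ones.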

Ligand  binding,  cell  growth,  migration,  and  invasion  assays 
We generated N-terminal or C-terminal alkaline phosphatase (AP) CXCL14
fusion proteins using the AP-TAG-5 expression vector (GenHunter, Nashville,
TN). We transfected mammalian cells with Fugene6 (Roche, Indianapolis,
IN), Lipofectamine, or Lipofectamine 2000 (LifeTechnologies, Rockville, MD)
reagents. We performed in vivo and in vitro ligand binding assays on primary
tissues and cell lines using AP-CXCL14 essentially as described (Flanagan
and Leder, 1990; Porter et al., 2003b). Briefly, we fixed frozen sections of
various human specimens, incubated them with either AP-CXCL14 fusion protein
or AP control conditioned medium, rinsed, and then incubated with AP
substrate forming a blue/purple precipitate. For in vitro assays we incubated
cells in suspension with conditioned media containing either AP alone or
AP-CXCL14 fusion protein, rinsed, and then assayed for bound AP activity.
To determine the effect of CXCL14 on cell growth, we plated MDA-MB-231
and MCF10A cells (4000 cells/well in a 24-well plate) and grew them in
conditioned media containing AP or AP-CXCL14. Conditioned media were
generated by transfecting 293 cells with pAP-Tag5 or pAP-CXCL14 plasmids
and growing them in McCoy’s medium supplemented with 10% FBS (to be
used for MDA-MB-231 cells) or in MCF10A media (American Type Culture
Collection, Manassas, VA, to be used for MCF10A cells). Cells were counted
(3 wells/time point) on days 1, 2, 4, 6, and 8 after plating. We used 10 nM
CXCL12 in MDA-MB-231 cells as positive control. The experiment was
repeated three times. In order to determine if CXCL14 binding to breast
cancer cells has an effect on cell migration and invasion, we tested the ability
of conditioned medium containing AP-CXCL14 or HA-tagged CXCL14
(expressed from pCDNA3.1) to induce the migration and invasion of MDA-MB-231 cells
using BIOCOAT Matrigel invasion chambers essentially as described (Muller
et al., 2001). For invasion assays, we plated 2.5 x 10^4 cells/well and assayed
24 hr later, while for migration assays we used 1.25 x 10^4 cells/well and
determined cell numbers 12 hr later. Conditioned medium of cells transfected
with pAP-Tag5 or pCDNA 3.1 empty vectors was used as negative control.
Novel estrogen and tamoxifen induced genes identified by SAGE
(Serial Analysis of Gene Expression)

The breast cancer promoting effects of estrogen and the
chemopreventive  effects  of  tamoxifen  are  thought  to  be 
mediated  by  the  estrogen  receptor,  a  ligand-dependent 
transcription  factor.  Therefore,  comprehensive  analysis  of 
gene  expression  profiles  following  estrogen  or  tamoxifen 
treatment  may  help  us  better  understand  the  role  estrogen 
plays  in  tumorigenesis.  We  utilized  SAGE  (Serial 
Analysis  of  Gene  Expression)  technology  to  identify 
genes  regulated  by  estrogen  and  tamoxifen  in  the  ZR75-1 
estrogen  dependent  breast  cancer  cell  line.  In  this  manner 
we  have  identified  several  genes  that  were  regulated  by 
estrogen  or  tamoxifen.  Here  we  report  the  identification 
and  initial  characterization  of  EIT-6  (Estrogen  Induced 
Tag-6),  a  novel  nuclear  protein  and  a  new  member  of  the 
evolutionarily conserved SM-20 family of growth regulatory
immediate-early genes. EIT-6 appears to be a
direct  transcriptional  target  of  the  estrogen  receptor  and 
constitutive  expression  of  EIT-6  promotes  colony  growth 
in  human  breast  cancer  cells.  These  data  indicate  that 
EIT-6  may  play  a  role  in  estrogen  induced  cell  growth. 
Oncogene (2002) 21, 836-843. DOI: 10.1038/sj/onc/1205113

Keywords:  estrogen;  tamoxifen;  SAGE  (Serial  Analysis 
of  Gene  Expression);  breast  cancer 


Breast  cancer  is  a  leading  cause  of  cancer  death  in 
women of the Western world (Greenlee et al., 2000).
Despite  advances  in  early  detection  and  treatment, 
breast  cancer  mortality  rates  have  not  decreased 
significantly  over  the  past  few  decades.  Thus,  there  is 
a  continued  and  increasing  need  for  the  identification 
of  risk  factors  of  breast  cancer  and  molecular  targets  of 
chemoprevention  in  order  to  decrease  the  incidence  of 
this  disease.  Estrogen  plays  a  key  role  in  the 
development  of  the  normal  mammary  gland  and  in 
the  initiation  and  progression  of  breast  carcinomas 
(Nandi et al., 1995). Clinical trials using anti-estrogens
(tamoxifen) proved the importance of estrogens in
breast tumor development, and identified tamoxifen as
a breast cancer preventive agent (Fisher et al., 1998,
1999). However, little is known about the
mechanisms  that  account  for  the  tumorigenic  effects 
of  estrogen  and  the  cancer  preventive  effects  of 
tamoxifen. 

The action of estrogen is mediated by its receptors
(estrogen receptors ERα and β), members of the
nuclear hormone receptor family and ligand-dependent
transcription factors (Katzenellenbogen, 1996; Mangelsdorf
et al., 1995). Estrogen binding stimulates the
trans-activating function of ER through its ability to
facilitate the recruitment of various receptor binding
co-activator proteins (Freedman, 1999). These receptor-co-activator
complexes then affect transcription
initiation at promoters regulated by estrogen. Anti-estrogens
not only preclude co-activator binding, but
can facilitate the recruitment of co-repressors and lead
to active repression of the basal expression of certain
genes. Many of the co-activators and co-repressors are
cell type and differentiation stage specific; therefore,
their interaction with ER may explain the diverse,
sometimes opposing effects of estrogen and tamoxifen
in different cell types. Several studies have been
performed to identify genes whose expression is
modulated by estrogen or tamoxifen treatment (Charpentier
et al., 2000; Inadera et al., 2000; Manning et
al., 1988, 1990, 1995). Several growth factors, growth
factor receptors, extracellular proteins, immediate-early
genes and cell cycle regulators that may have effects on
mammary carcinogenesis have been identified as
potential targets of the estrogen-signaling pathway
(de Cupis and Favoni, 1997; Katzenellenbogen et al.,
1997). However, none of these induced genes can fully
explain  the  mitogenic  effects  of  estrogen  or  the 
chemopreventive  effects  of  tamoxifen.  In  addition, 
there  is  an  increasing  need  to  identify  new  estrogen 
and  tamoxifen  targets  that  could  be  used  as  biomarkers 
to  monitor  the  efficacy  of  cancer  treatment  and 
prevention. 

In order to determine the global cellular response of
breast cancer cells to estrogen and tamoxifen, we have
generated SAGE (Serial Analysis of Gene Expression)
libraries from an estrogen dependent human breast
cancer cell line (ZR75-1) prior to and following
estrogen or tamoxifen treatment. SAGE enabled us
to determine the absolute abundance of thousands of
different mRNAs simultaneously in a comprehensive
and unbiased way and to detect even slight differences
in expression levels between samples (Velculescu et al.,
1995). ZR75-1 cells cultured in the absence of
endogenous hormones for 7 days were switched to
fresh medium in the absence or presence of 10 nM
estradiol or 10 µM 4-hydroxy-tamoxifen. Cells were
collected after 16 h and response to the hormonal
treatment was confirmed by FACS analysis of cell cycle
progression and by Northern blot analysis using
known estrogen target genes. Estrogen deprived
ZR75-1 cells arrested in G1 and G2 phases of the cell
cycle, while addition of estrogen, and to a lesser degree
tamoxifen, stimulated rapid S phase entry (data not
shown). SAGE libraries from untreated, estrogen or
tamoxifen treated cells were generated using a modified
micro-SAGE protocol (Porter et al., 2001). From the
three SAGE libraries 140 638 tags were obtained,
approximately 45 000 from each library (SAGE data
will be deposited at http://www.ncbi.nlm.nih.gov/SAGE/).
Pair-wise comparison and statistical analysis
of these libraries led to the identification of several
estrogen and/or tamoxifen induced transcripts. There
were 61 tags (33 up-regulated and 22 down-regulated)
that showed at least twofold difference (P<0.001)
between the estrogen treated and control libraries,
while 15 tags (nine up-regulated and six down-regulated)
showed at least twofold difference
(P<0.001) between the tamoxifen treated and control
libraries. In addition, we found 22 tags that were
significantly elevated in the estrogen treated cells when
compared to the tamoxifen treated ones, while 24 tags
were significantly elevated in the tamoxifen treated
library compared to the estrogen treated one. Linking
the UniGene database to our SAGE data identified the
cDNAs corresponding to the SAGE tags in most of the
cases (Table 1). Genes were named according to their
abundance in the three SAGE libraries: EIT (Estrogen
Induced Tag), induced by estrogen; TIT (Tamoxifen
Induced Tag), induced by tamoxifen; and DET (Differently
Expressed Tag), differently expressed following estrogen
or tamoxifen treatment.
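
The pair-wise comparison described above can be illustrated with a small sketch. The text does not state which statistical test produced the P values, so the two-sided Fisher exact test used here is an assumption, as are the equal library sizes of 46 879 tags (140 638 divided by three), the `floor` applied to zero counts, and the function names. The EIT-6 tag counts (5 in the control and 26 in the estrogen library) are taken from Table 1.

```python
from math import exp, lgamma


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_two_sided(count_a, count_b, size_a, size_b):
    """Two-sided Fisher exact P for one tag seen count_a and count_b times
    in libraries of size_a and size_b tags (an assumed choice of test)."""
    total = count_a + count_b
    n = size_a + size_b

    def log_p(k):
        # Hypergeometric probability that k of the `total` copies fall in library A.
        return (log_choose(size_a, k) + log_choose(size_b, total - k)
                - log_choose(n, total))

    observed = log_p(count_a)
    lo, hi = max(0, total - size_b), min(total, size_a)
    # Sum every outcome that is no more likely than the observed one.
    p = sum(exp(log_p(k)) for k in range(lo, hi + 1)
            if log_p(k) <= observed + 1e-9)
    return min(1.0, p)


def fold_change(control, treated, size_control, size_treated, floor=1.0):
    """Ratio of library-normalized abundances; zero counts are raised to
    `floor` (an assumption) so that the ratio stays finite."""
    c = max(control, floor) / size_control
    t = max(treated, floor) / size_treated
    return t / c


LIBRARY_SIZE = 46879  # assumed: 140 638 tags split evenly over three libraries
print(fold_change(5, 26, LIBRARY_SIZE, LIBRARY_SIZE))
print(fisher_two_sided(5, 26, LIBRARY_SIZE, LIBRARY_SIZE))
```

Under these assumptions the EIT-6 tag passes both criteria used in the text, a difference of at least twofold and P<0.001. A different test or unequal library sizes would change the exact P value.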

Since SAGE tag numbers reflect the absolute
abundance of the mRNAs, data obtained from
different experiments performed in different laboratories
are directly comparable (Velculescu et al., 1995).
Therefore, our data and SAGE libraries generated
from untreated or estrogen treated MCF-7 cells by
others (Charpentier et al., 2000) were analysed using a
clustering algorithm to delineate similarities and
differences between the effects of estrogen in two
different breast cancer cell lines (Figure 1a). Consistent
with previous studies, the two breast cancer cell
lines had distinct gene expression patterns and
demonstrated a discrete transcriptional response to
estrogen treatment (Perou et al., 2000). Among the
known estrogen target genes, cathepsin D was induced
by estrogen in both cell lines, whereas pS2 and cyclin
D1 were induced only in MCF-7 cells (Figure 1a). All
three SAGE libraries derived from the same cell line
were more similar to each other, even after estrogen
treatment, than to the other cell line. Interestingly,
untreated and estrogen treated MCF-7 cells were
highly similar to each other, while tamoxifen treated
and untreated ZR75-1 cells were somewhat more
similar to each other and distinct from estrogen treated
cells. These findings indicate that estrogen exerts a
differing effect depending on the cellular context, and
overall there are relatively few genes significantly
affected by estrogen or tamoxifen treatment in these
breast cancer cell lines.
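
The filtering and similarity steps behind such a clustering can be sketched as follows. This is only an illustration, not the Cluster program itself: the thresholds follow the settings given in the Figure 1 legend, but applying them to raw tag counts, the `log2(count + 1)` transform and the centered Pearson correlation distance are all assumptions. The counts are four ZR75-1 rows from Table 1 plus one hypothetical low-abundance tag; because Table 1 lists only regulated tags, the printed distances are not representative of whole libraries.

```python
from math import log2, sqrt

# (C, E, T) tag counts for four ZR75-1 tags from Table 1, plus one
# hypothetical low-abundance tag that the filter should reject.
TAG_COUNTS = {
    "EIT-6": (5, 26, 11),
    "EIT-10": (112, 451, 315),
    "TIT-5": (9, 19, 34),
    "DET-15": (11, 1, 12),
    "low (hypothetical)": (1, 2, 3),
}


def passes_filter(counts, min_abs=5, min_range=5):
    """Keep a tag if any count reaches min_abs and max - min exceeds min_range."""
    return (any(abs(v) >= min_abs for v in counts)
            and max(counts) - min(counts) > min_range)


def pearson(x, y):
    """Centered Pearson correlation of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


kept = {name: c for name, c in TAG_COUNTS.items() if passes_filter(c)}

# One log-transformed expression profile per library (the table columns).
profiles = dict(zip(
    ("ZU", "ZE", "ZT"),
    zip(*[[log2(v + 1) for v in c] for c in kept.values()]),
))

# Correlation distance (1 - r) for each pair of libraries; a hierarchical
# clustering would repeatedly merge the closest pair.
for a, b in (("ZU", "ZE"), ("ZU", "ZT"), ("ZE", "ZT")):
    print(a, b, round(1 - pearson(profiles[a], profiles[b]), 3))
```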

To validate the result of the SAGE experiment we
have generated probes corresponding to some of the
cDNA clones and confirmed their induction by
Northern blot analysis (Figure 1b). Of the 20
estrogen or tamoxifen induced genes (Table 1) only
one (EIT-10, cathepsin D) had been implicated as a
target of ER-transcriptional activation, and three had
not previously been described at all. In most of the
cases the sequences of the genes provided important
clues as to their potential functions (Table 1). In
particular, several of these genes are predicted to be
involved in the regulation of cell proliferation and/or
survival. EIT-2 is a protein translocase involved in
importing nuclear encoded proteins to the mitochondria
(Bauer et al., 1999); EIT-4 is a human
homologue of the yeast Dim1p gene essential for
mitosis (Berry and Gould, 1997); EIT-6 is an EST
homologous to a rat immediate-early gene SM-20
(Wax et al., 1994); TIT-5 is an anti-apoptotic
member of the bcl-2 family (Xu and Reed, 1998);
and DET-15 and DET-16 are both putative transcription
factors with anti-proliferative activity (Ismail
et al., 1999; Nakashiro et al., 1998; Shibanuma et al.,
1992). Although several ESTs have no homology to
known genes, their expression pattern in other SAGE
libraries suggests that they also might play a role in
cell proliferation and/or estrogen mediated responses
(Lai et al., 1999). For example, the TIT-3 and TIT-1
genes appear to be elevated in ER+ DCIS (Ductal
Carcinoma In Situ) compared to corresponding
normal mammary epithelium and expressed at much
lower levels in other cell types (Porter et al., 2001).
Interestingly, one of the tamoxifen induced genes,
SULT1A phenol sulfotransferase, is an enzyme
involved in the metabolism of environmental carcinogens
and steroid hormones including estrogen and
tamoxifen (Weinshilboum et al., 1997). Recently we
and others reported that polymorphism in SULT1A1
influences the age of onset and the risk of breast
cancer, respectively (Seth et al., 2000; Zheng et al.,
2001).

Figure 1 Expression analysis of novel estrogen and tamoxifen induced genes.
(a) Variation in expression of 2818 genes in six SAGE libraries and
dendrogram representing similarities in expression pattern among samples.
Only parts of certain clusters are included in the figure. These clusters
are genes more abundant in ZR75-1 (A) or MCF-7 (B) cells, genes with fairly
equal levels in both cell lines and libraries (C), genes induced by estrogen
either in ZR75-1 or MCF-7 cells (D) and genes more abundant in tamoxifen and
estrogen treated (E) or tamoxifen treated and untreated ZR75-1 (F) cells.
Each row represents a gene/SAGE tag, while each column corresponds to a SAGE
library. MU, ME3, and ME10 denote MCF-7 cells untreated, estrogen treated
for 3, or 10 h, respectively. Similarly ZU, ZE, and ZT stand for untreated,
estrogen, or tamoxifen treated ZR75-1 cells. The absolute abundance of the
SAGE tag in the library (SAGE tag number) correlates with red color
intensity (black = not present, intense red = highly abundant). Genes
highlighted in red correspond to known estrogen targets or genes
statistically significantly affected by hormonal treatment. (b) Correlation
of Northern blot (left panel) and SAGE (right panel) results for the
indicated genes. Numbers represent SAGE tag numbers. RNA and SAGE libraries
were prepared from untreated (C), estrogen (E) and tamoxifen (T) treated
cells. CATH D and PR denote cathepsin D and progesterone receptor,
respectively. (c) In vivo expression of EIT-6 in normal breast tissue.
Histological sections of normal human breast tissue were stained with
hematoxylin-eosin (H&E). Adjacent slides were hybridized with P33 labeled
EIT-6 anti-sense or sense probes and visualized using dark-field microscopy.
Intense hybridization signal is detected in mammary epithelial cells using
the antisense, but not the sense probe. (d) Analysis of EIT-6 mRNA levels in
various breast (ZR75-1, BT-474, T47D, MCF-7) and endometrial (RL95-2,
HEC-1A) cancer cell lines following estrogen (E) or tamoxifen (T) treatment.
Estrogen receptor (ER) status of the cell lines is indicated by '+' and '-'
signs. (e) EIT-6 expression in cells treated with no drug (C), estrogen (E),
tamoxifen (T), ICI 182 780 (ICI) or with combinations of estrogen and
tamoxifen (E+T) and estrogen and ICI (E+ICI). Breast cancer cell lines were
obtained from American Type Culture Collection and maintained according to
the supplier. To assay estrogen responsiveness cells were cultured in phenol
red free medium (Life Technologies) supplemented with 5% charcoal treated
fetal bovine serum (Hyclone) after which cells were switched to fresh medium
or fresh medium containing 10 nM estradiol or 10 µM 4-hydroxy-tamoxifen.
SAGE libraries were generated and analysed as previously described (Porter
et al., 2001). Hierarchical clustering was applied to data using the Cluster
program developed by Eisen et al. (1998). Data was log-transformed and
filtered for at least one observation with abs(Val) ≥ 5 and
MaxVal-MinVal > 5. Using these settings 2818 genes (out of 16 808 total)
were included in the analysis. Results were displayed with the TreeView
program (Eisen et al., 1998). mRNA in situ hybridizations using P33 labeled
sense or anti-sense EIT-6 ribo-probes were performed as described (Rosen et
al., 1999). The effect of estrogen antagonists was determined by
pre-incubating the cells with 1 µM ICI 182 780 or 10 µM 4-hydroxy-tamoxifen
for 6 h followed by estrogen treatment for an additional 24 h. RNA
isolation, RT-PCR and Northern blot analyses were performed as described
(Polyak et al., 1997)

Table 1  Estrogen and tamoxifen regulated genes

Name    SAGE tag     C    E    T    GenBank #  Function
EIT-1   GCGGTGACAG   1    14   8    NA         No reliable database match
EIT-2   TACGAAGTTC   1    14   9    AF077039   TIM17B
EIT-3   AATGAGTTTG   2    19   13   AF201940   DC6 mRNA
EIT-4   GTCTTAACTC   2    18   9    BC001046   Yeast Dim1 homologue
EIT-5   GTGGCATCAC   5    27   18   AB043104   Nop10p snRNP
EIT-6   GGTGTGGAAG   5    26   11   AY040565   SM-20 homologue
EIT-10  GAAATACAGT   112  451  315  X05344     Cathepsin D ECM protease
TIT-1   CAACGAAACC   0    4    14   NA         No reliable database match
TIT-2   GCGTGCTCTC   0    3    11   NA         No reliable database match
TIT-3   GGGGGCCCCG   8    28   32   AF004876   Yeast Yif1p homologue
TIT-4   GGGGCCCCCT   7    27   28   Z96932     Sjogren’s syndrome antigen
TIT-5   CCACCCCGAA   9    19   34   BC000916   BI-1 bax antagonist
DET-1   TCTCTGCAAA   4    15   1    BC000890   Hypothetical protein FLJ20640
DET-2   TGGATCCTCG   3    13   1    BC001239   Hypothetical protein FLJ10479
DET-13  ATGAAAACTC   7    1    11   AA524901   ESTs no homology
DET-14  AGCCACCGTG   4    1    12   NA         No reliable database match
DET-15  CCCCCGCGGA   11   1    12   AF130366   USF2 transcription factor
DET-16  TTTGCGGTCC   7    0    11   BC002972   TSC-22-like protein
DET-17  GCTGGGGACT   0    1    10   NM_001055  SULT1A sulfotransferase

Although our SAGE analysis led to the isolation of
several novel estrogen and/or tamoxifen regulated
genes, it provided no information on which of these
genes might be key mediators of the cellular response
initiated by estrogen and/or tamoxifen. However, one
of the estrogen induced genes, EIT-6, appeared to be
particularly interesting due to its relatively high
abundance in SAGE libraries prepared from hormone
responsive tissues (normal and cancerous mammary,
ovarian and prostate epithelium) (Lai et al., 1999; Lash
et al., 2000), and we therefore characterized it in
further detail. A full-length (2071 bp) human EIT-6
cDNA was obtained by analysing and sequencing EST
clones. The human EIT-6 cDNA contains an open
reading frame (ORF) of 1221 bp encoding a predicted
protein of 407 amino acids (~43 000 Daltons), which
was confirmed by in vitro transcription/translation
experiments (data not shown). By FISH analysis we
localized EIT-6 to chromosome 19q13.1, a region not
previously implicated in breast cancer (data not
shown). We also analysed the in vivo abundance of
the EIT-6 mRNA and determined its expression at the
cellular level by mRNA in situ hybridization of normal
human breast tissue (Figure 1c). EIT-6 hybridization
signal showed fairly even intensity throughout the
mammary epithelium, while no significant signal was
detected in stromal cells.

To determine how generally EIT-6 is regulated by
estrogen, we performed Northern blot analysis of
various ER+ breast and endometrial cancer cell lines
(Figure 1d). In addition to being induced in ZR75-1
cells (~5-fold induction), the cell line used for the
generation of the SAGE libraries, EIT-6 is induced in
BT-474 cells (~10-fold induction), but not in other
estrogen responsive cell lines analysed. This finding is
not unexpected, since many estrogen targets are
induced in a cell type specific manner.

To determine if the induction of EIT-6 by estrogen is
ER mediated, we analysed the effect of estrogen
antagonists (4-hydroxy-tamoxifen and ICI 182 780)
on EIT-6 mRNA levels (Figure 1e). Consistent with
our SAGE data, EIT-6 was induced only by estrogen,
and this induction was completely or partially abolished
by the addition of ICI 182 780 or tamoxifen,
respectively. Thus, the increase in EIT-6 mRNA levels
following estrogen treatment is both estrogen and ER
dependent. There are multiple ways estrogen may be
regulating the expression levels of EIT-6: (1) direct
transcriptional regulation; (2) influencing mRNA
stability; or (3) indirectly through other transcription
factors/signaling pathways. Northern blot analysis of
EIT-6 mRNA levels at different time points following
estrogen treatment indicated that EIT-6 is induced by
estrogen at about the same time as cathepsin D, a
known direct estrogen target (Figure 2a). This result
suggests that EIT-6, similar to cathepsin D, may also
be a direct transcriptional target of ER. Addition of
transcription inhibitors completely abolished EIT-6
induction by estrogen indicating that increased EIT-6
mRNA levels are likely due to increased transcription
(data not shown). To further investigate if EIT-6 is a
direct or indirect target of ER we analysed its mRNA
levels  following  estrogen  treatment  in  the  presence  of  a 
protein  synthesis  inhibitor  (cycloheximide)  in  BT-474 
cells  (Figure  2b).  Cycloheximide  and  estrogen  both 
increased  EIT-6  mRNA  levels  to  a  certain  degree,  but 
estrogen  treatment  in  the  presence  of  cycloheximide  led 
to  a  much  stronger  induction.  Based  on  these  results, 
the  induction  of  EIT-6  following  estrogen  treatment 
does  not  appear  to  require  new  protein  synthesis. 
Therefore,  EIT-6  is  likely  to  be  a  direct  transcriptional 
target  of  ER,  but  the  possibility  that  other  proteins  are 
also  involved  in  the  transcriptional  activation  of  EIT-6 
cannot  be  excluded. 

Figure 2 EIT-6 is a direct estrogen receptor target. (a) Northern blot
analysis of time-course of EIT-6 and cathepsin D (CATH D) induction
following estrogen treatment for the indicated time. (b) Analysis of the
effect of cycloheximide (CHX) on EIT-6 mRNA levels. Cells were treated with
ethanol alone (C) or together with CHX (C+C), and estrogen alone (E) or
together with CHX (E+C). (c) Representative experiments demonstrating
increased luciferase activity following estrogen treatment in cells
transiently transfected with an EIT-6 5.5 kb promoter luciferase construct.
UT and E denote untreated and estrogen treated cells, respectively.
(d) EIT-6 genomic structure and reporter constructs. (e) Results of
luciferase assays following transient transfections of HepG2 cells with the
indicated constructs (x-axis). Fold induction by estrogen is indicated on
y-axis. Numbers are average of three independent experiments performed in
quadruplicate; luciferase activity was normalized for transfection
efficiency by using the ratio of luciferase to β-galactosidase activity. A
construct containing two copies of the consensus ERE from the vitellogenin A
promoter was used as positive control (VIT) (McMahon et al., 1999). Estrogen
treatment, RNA isolation and Northern blot analysis were performed as
described above. The effect of cycloheximide (CHX) was assayed by treating
the cells with 10 µg/ml CHX for 16 h in the presence or absence of estrogen.
EIT-6 promoter luciferase reporter constructs were generated by subcloning a
~5.5 kb BAC (bacterial artificial chromosome) derived fragment of the EIT-6
promoter or concatemers of PCR-derived fragments of the human EIT-6 promoter
containing the putative EREs into pBR-pl-luc or pBR-pl-TATA-luc,
respectively (Polyak et al., 1997). Cells were transfected using FuGene6
(Roche), treated with estrogen the day after transfection and the following
day luciferase and β-galactosidase activities were determined using a
luciferase assay system (Promega) and the Aurora GAL-XE reporter gene assay
(ICN), respectively

To further characterize the mechanism by which ER
induces EIT-6, first we determined if a ~5.5 kb
fragment of the proximal EIT-6 promoter confers
estrogen responsiveness to a luciferase reporter gene.
Measurement of luciferase activity in 911 cells
following transient transfection of this construct with
co-transfected ERα demonstrated a modest, but
reproducible induction following estrogen treatment
(Figure 2c). Random 2-3 kb fragments derived from
the EIT-6 genomic region placed up-stream of a
luciferase gene conferred no response to estrogen
treatment (data not shown). Next, we analysed the
sequence of this promoter region and identified several
potential estrogen responsive elements (EREs) closely
resembling the consensus ERE sequence (Figure 2d).
To determine whether these putative EREs can confer
estrogen responsiveness, we generated various constructs
with concatemers of each of these elements or
their combination placed up-stream of a luciferase gene
(Figure 2d). Measurement of luciferase activity following
transient transfection of these constructs with
co-transfection of ERα in HepG2 cells revealed that most
of them demonstrated some, although relatively weak,
estrogen responsiveness (Figure 2e). The most significant
(4-5-fold) induction was observed using the E4
construct containing three copies of the E4 ERE. This
ERE is closest to the transcription start site and
contains a nearly perfect ERE with only two
mismatches compared to the consensus sequence. In
the same experiments two copies of the consensus ERE
derived from the vitellogenin promoter (VIT) led to a
10-11-fold induction in luciferase activity following
estrogen treatment (Figure 2e). Therefore, a 4-5-fold
induction observed with the E4 ERE is significant and
indicates that this fragment could be a functional ERE.
Although these experiments do not prove that any of
these putative EREs are necessary for the induction of
EIT-6 by estrogen, they do show that these elements
can confer estrogen responsiveness. Therefore, direct
binding of ER to these putative EREs may be
responsible for the transcriptional induction of EIT-6,
but other mechanisms cannot be excluded.
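
The idea of a putative ERE that differs from the consensus by a small number of mismatches can be sketched as a simple sequence scan. The consensus used below is the palindromic 13-mer GGTCAnnnTGACC; the example sequences and the function names are hypothetical, since the E4 sequence itself is not given in the text, and this is not necessarily how the promoter was searched.

```python
CONSENSUS_ERE = "GGTCANNNTGACC"  # palindromic consensus; N matches any base


def mismatches(site, consensus=CONSENSUS_ERE):
    """Count positions where `site` differs from the consensus (N is ignored)."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(1 for base, ref in zip(site.upper(), consensus)
               if ref != "N" and base != ref)


def find_ere_like_sites(sequence, max_mismatches=2):
    """Return (start, site, mismatch count) for every window within the limit."""
    width = len(CONSENSUS_ERE)
    hits = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        count = mismatches(site)
        if count <= max_mismatches:
            hits.append((start, site, count))
    return hits


# Hypothetical sequences, for illustration only.
print(find_ere_like_sites("AAGGTCACAGTGACCTT", max_mismatches=0))
print(mismatches("GGTCACAGTGTCT"))
```

Because the consensus is its own reverse complement, a site has the same mismatch count on either strand, so scanning one strand is sufficient for this purpose.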

Figure 3 EIT-6: homologues and putative function. (a) Amino acid alignment
of C-terminal parts of human EIT-6 (amino acids 260-383), rat SM20 (amino
acids 214-336) and C. elegans egl-9 (amino acids 451-574) proteins using
MacVector and ClustalW alignment. (b) Phylogenetic comparison of EIT-6
homologues. Comparisons were made using DNAStar and the Jotun Hein
algorithm. (c) Subcellular localization of EIT-6. Cells transfected with
constructs encoding control GFP and mito-GFP or EIT-6-GFP proteins were
visualized with fluorescence microscopy. Magnification objective 20×.
(d) Representative flasks from colony assay experiments. (e) Quantification
of colony assay experiments. Presence (+) or absence (-) of estrogen and
plasmids transfected (pCEP4 or pCEP4-EIT6) are indicated on the x axis,
while colony numbers (per cm²) are plotted on the y axis. Results are the
average of three independent experiments. For intracellular localization
studies EIT-6-GFP fusion fragment was PCR amplified as described (Flatt et
al., 2000) and subcloned into pShuttle-CMV construct (He et al., 1998).
Cells were transfected with pEGFP-N, pEGFP-mito (Clontech) or
pShuttle-EIT-6-GFP plasmids and analysed by fluorescence microscopy. To
assay colony growth expression construct was generated by subcloning a PCR
derived EIT-6 cDNA with a C-terminal double hemagglutinin (HA2) tag into
pCEP4 (Invitrogen). T47D cells were transfected with FuGene6 (Roche) and
selected in hygromycin containing medium for 2 weeks after which colonies
were visualized by crystal violet staining. The number of colonies/cm² was
determined based on spot-densitometry assisted counting using a MultiImage
Lightbox (Alpha Innotech). Two independent areas/flask and two
flasks/experiment were analysed. In some experiments the cells were grown in
regular 10% fetal bovine serum (FBS) containing RPMI medium, while in other
experiments cells were cultured in phenol red free RPMI medium (Life
Technologies) supplemented with 5% charcoal treated FBS (Hyclone) in the
presence or absence of 10 nM estradiol

Oncogene
Estrogen and tamoxifen induced genes
P Seth et al

EIT-6 is a novel human gene and although it is
homologous to the SM-20 rat immediate-early gene
induced by growth agonists (Wax et al., 1994), it is not
the human orthologue of this rat gene (Figure 3a).
However, similar to SM-20, EIT-6 is also induced by
various growth agonists (EGF, isoproterenol, and PMA)
in ZR75-1 cells (data not shown). The rat SM-20 cDNA
was also identified as a gene induced by wild-type p53
and as a gene induced in sympathetic neurons during
NGF (Nerve Growth Factor) withdrawal initiated
apoptosis (Lipscomb et al., 1999; Madden et al., 1996).
Subsequent experiments in muscle cells determined that
SM-20 may play a role in the regulation of myoblast
proliferation and differentiation (Moschella et al., 1999).
Interestingly, a recent study isolated another SM-20/
EIT-6 homologue as a gene significantly overexpressed in
over 50% of endometrial carcinomas, the development
of which is thought to depend on estrogenic hormones
(Foca et al., 2000). Based on these results we can
conclude that EIT-6 is a member of a multigene family
of growth regulatory genes that includes SM-20, EIT-6,
and additional SM-20 homologues. EIT-6 appears to
encode a protein with evolutionarily conserved function:
there is a C. elegans EIT-6 homologue (Figure 3a,b) that
was identified as an egg laying defective mutant (egl-9)
(Trent et al., 1983), and several additional SM-20/EIT-6
homologues were identified from various other species
including several types of bacteria (Figure 3b).

Surprisingly, immunohistochemical analysis of the rat
SM-20 gene demonstrated cytoplasmic staining, while
the human SM-20 orthologue, another related human
gene (SCAND2), and EIT-6 all contain putative nuclear
localization signals (Dupuy et al., 2000; Wax et al., 1994).
However, most of the homology between EIT-6 and SM-20
resides in the C-terminal region, while the N-terminal
domains (containing the nuclear localization sequence)
are divergent. Therefore, to determine the sub-cellular
localization of EIT-6, we generated a construct expressing
an EIT-6-GFP (Green Fluorescent Protein) fusion
protein. Fluorescence microscopic analysis of cells
transiently transfected with control GFP and
mitochondrial-GFP (mito-GFP) encoding constructs revealed
mostly cytoplasmic and mitochondrial localization,
respectively (Figure 3c). In contrast, the EIT-6-GFP
protein was detected only in the nucleus. Similar results
were obtained by Western blot analysis of fractionated
cell extracts prepared from cells expressing a hemagglutinin
epitope tagged EIT-6 protein (data not shown).
We were unable to determine the localization of the
endogenous EIT-6 protein, since the polyclonal antibodies
we generated were not suitable for immunohistochemical
analysis. However, based on these data EIT-6 is
likely to be located and function in the nucleus.

Based on its homology to SM-20 we hypothesized
that EIT-6 expression influences cell proliferation and/or
survival. In order to test this, we transfected T47D
ER+ breast cancer cells with control (pCEP4) or
pCEP4-EIT6 (encoding a C-terminal hemagglutinin
epitope tagged EIT-6 protein) constructs. Stable
transfectants were selected by culturing the cells in
the presence of hygromycin for 2 weeks after which
colonies were visualized by crystal violet staining. The
expression of EIT-6 was confirmed by immunoblot
analysis of cell extracts prepared from pools of stable
clones using anti-HA antibody (data not shown). In
three independent experiments performed in duplicate
flasks, expression of EIT-6 led to a significant (3-4-fold)
increase in colony numbers (representative
examples in Figure 3d, summary of colony counts in
Figure 3e). Similar results were obtained in
MDA-MB-435S ER negative breast cancer cells (data not shown).
These data indicate that EIT-6 overexpression enhances
colony growth in human breast cancer cells.

To test if expression of EIT-6 can confer estrogen
independent growth, we transfected the ER+ and estrogen
dependent T47D breast cancer cells with control
pCEP4 or pCEP4-EIT6 constructs. Stable transfectants
were selected by culturing the cells in the absence of
hormones in phenol red free medium supplemented with
5% charcoal/dextran treated fetal bovine serum and
hygromycin. Half of the flasks were cultured in the
presence of 10 nM estrogen. Colonies were visualized by
crystal violet staining after 2 weeks of selection. Very few
colonies were observed in the absence of estrogen in
control pCEP4 transfected cells, confirming the requirement
of estrogen for T47D cell growth (Figure 3e). In
contrast, a significant number of colonies was observed in
EIT-6 transfected cells in the absence of estrogen,
indicating that EIT-6 expression relieves estrogen
dependency (Figure 3e). Addition of estrogen increased
colony numbers in both control pCEP4 and EIT-6
transfected cells; therefore, EIT-6 expression may not be
sufficient to completely alleviate estrogen dependence
(Figure 3e). Due to the lack of suitable antibodies we were
unable to determine the relative levels of ectopic and
endogenous EIT-6 proteins in these clones. Therefore, the
possibility that increased colony numbers were due to
exogenous EIT-6 protein levels significantly exceeding
endogenous estrogen induced EIT-6 protein levels cannot
be excluded.
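
The colony quantification summarized in Figure 3e amounts to averaging colony densities over areas, flasks and experiments and then comparing conditions. A minimal sketch of that arithmetic follows; the layout of two areas in each of two flasks per experiment and three experiments follows the Figure 3 legend, but every number below is hypothetical and the function names are illustrative only.

```python
from statistics import mean


def experiment_density(area_densities):
    """Mean colonies/cm² for one experiment (two areas in each of two flasks)."""
    return mean(area_densities)


def condition_density(experiments):
    """Average of the per-experiment means across independent experiments."""
    return mean(experiment_density(e) for e in experiments)


def fold_increase(treated_experiments, control_experiments):
    """Ratio of mean colony density, treated over control."""
    return condition_density(treated_experiments) / condition_density(control_experiments)


# Hypothetical colonies/cm² values: three experiments, four areas each.
control_pcep4 = [[2.1, 1.9, 2.0, 2.0], [1.8, 2.2, 2.0, 2.0], [2.0, 2.0, 2.1, 1.9]]
pcep4_eit6 = [[7.2, 6.8, 7.0, 7.0], [6.9, 7.1, 7.0, 7.0], [7.0, 7.0, 6.8, 7.2]]

print(round(fold_increase(pcep4_eit6, control_pcep4), 2))
```

Averaging within each experiment first keeps the three independent experiments equally weighted even if the number of counted areas differs between them.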

In summary, based on SAGE analysis of estrogen
and tamoxifen treated breast cancer cells, we identified
several novel putative targets of the estrogen signaling
pathway. One of these genes, EIT-6, may be involved
in transmitting growth promoting signals initiated by
estrogen in human breast cancer cells. EIT-6 has no
known functional domains and EIT-6 homologues are
not well characterized; therefore, we can only speculate
on how EIT-6 may regulate estrogen dependent and
independent cell growth. Since EIT-6 is a nuclear
protein, one attractive hypothesis is that EIT-6 may
influence the transcription of genes involved in the
regulation of cell proliferation or survival. However,
further studies are required to determine if EIT-6 or
any of the other genes are critical downstream
components of the estrogen signaling pathway.


Note  added  in  proof 

During the review of this manuscript two independent
studies (Epstein et al., Cell, 107, 43-54 and Bruick et al.,
Science, 294, 1337-1340) identified EIT-6 as a dioxygenase
that may regulate HIF (hypoxia inducible factor) by
hydroxylation.